
//MARCH/APRIL 2015/ 


Netflix powers through 2 billion content requests 
per day with Java-driven architecture 


By and for the Java community 


IMPROVE THE 
OBSERVABILITY 
OF FORK/JOIN 


THE QUANTUM 
PHYSICS OF JAVA 


ORACLE.COM/JAVAMAGAZINE 


ORACLE 











//table of contents / 


21 

ACTION! 

Netflix moves fast to provide the 
viewing experiences its customers 
want. Java helps make it happen. 



Java and 
Performance 


New theme icon. See how it works. 


COVER ART BY PHIL SALTONSTALL 



27 e 

Java Architect 

IMPROVING THE 
OBSERVABILITY 
OF FORK/JOIN 
OPERATIONS 

How to determine when 
adding parallelism might 
help performance 


COMMUNITY 

03 

From the Editor 

05 

Java Nation 

News, people, books, and events 

M 

JCP Executive Series 

Performance, Innovation, 
and Success 

Intel’s Anil Kumar describes how 
collaboration drives a healthier 
Java ecosystem. 


JAVA IN ACTION 

T2 |J] 

loT Developer Challenge 
Winner 

Control your greenhouse environment 
automatically with Smart Greenhouse. 

JAVA TECH 
33 

Java Architect 

Understanding Java JIT 
Compilation with JITWatch 

See the effects of small source code 
changes and Java HotSpot VM 
switches. 


40 

Java Architect 

The Quantum Physics of Java 

An introduction to modern chip design 
and its effect on Java programs 

47 

Java Architect 

Shakespeare Plays Scrabble 

New ways of solving problems with the 
Stream API in Java SE 8 

57 

Enterprise Java 

Improving the Performance 
of Java EE Applications 

Incorporate performance tuning into your 
development lifecycle. 


51 

Enterprise Java 

BIG DATA AND JAVA 

Our panel of experts 
answers pressing 
questions about working 
with Java and big data. 


65 

Rich Client 

Writing JavaFX Applications 
for Mobile Devices 

With recent JavaFX ports, you can 
now create Java client applications 
for mobile devices. 

69 

Fix This 

Take our Date and Time API 
code challenge! 


ORACLE.COM/JAVAMAGAZINE ////////////////////////////////////////// MARCH/APRIL 2015 
































EDITORIAL 

Interim Editor in Chief 

Kay Keppler 

Community Editor 

Yolande Poirier 

Java in Action Editor 

Michelle Kovac 

Technology Editor 

Tori Wieldt 
Contributing Writer 

Kevin Farnham 
Contributing Editors 

Claire Breen, Blair Campbell, Stephen Chin, 
Kay Keppler, Karen Perkins, Reza Rahman, 
Simon Ritter, James Weaver 

DESIGN 

Senior Creative Director 

Francisco G Delgadillo 

Design Director 

Richard Merchan 

Contributing Designers 

Jaime Ferrand, Diane Murray, 

Arianna Pucherelli 

Production Designers 

Sheila Brennan, Kathy Cygnarowicz 


PUBLISHING 

Publisher 

Jennife r Ha milton +1.650.506.3794 

Associate Publisher and Audience 
Development Director 

Karin Kinnear +1.650.506.1985 

Audience Development Manager 

Jennifer Kurtz 

ADVERTISING SALES 

President, Sprocket Media 

Kyle Walkenhorst +1.323.340.8585 

Western and Central US, LAD, and 
Canada, Sprocket Media 
Tom Cometa +1.510.339.2403 
Eastern US and EMEA/APAC, 
Sprocket Media 

Mark Makinney +1.805.709.4745 

Advertising Sales Assistant 

Cindy Elhai +1.626.396.9400 x 201 

Mailing-List Rentals 

Contact your sales representative. 

RESOURCES 

Oracle Products 

+1.800.367.8674 (US/Canada) 

Oracle Services 

+1.888.283.0591 (US) 

Oracle Press Books 

oracJepressbooks.com 


ARTICLE SUBMISSION 

If you are interested in submitting an article, please e-mail the editors. 

SUBSCRIPTION INFORMATION 

Subscriptions are complimentary for qualified individuals who complete the 
subscription forrn. 

MAGAZINE CUSTOMER SERVICE 
iava@halldata.com Phone +1.847.763.9635 

PRIVACY 

Oracle Publishing allows sharing of its mailing list with selected third parties. If you prefer 
that your mailing address or e-mail address not be included in this program, contact 
Customer Service. 


Copyright © 2015, Oracle and/or its affiliates. All Rights Reserved. No part of this publication may be reprinted or otherwise 
reproduced without permission from the editors. JAVA MAGAZINE IS PROVIDED ON AN “AS IS” BASIS. ORACLE EXPRESSLY 
DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED. IN NO EVENT SHALL ORACLE BE LIABLE FOR ANY 
DAMAGES OF ANY KIND ARISING FROM YOUR USE OF OR RELIANCE ON ANY INFORMATION PROVIDED HEREIN. The 
information is intended to outline our general product direction. It is intended for information purposes only, and may not be 
incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied 
upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s 
products remains at the sole discretion of Oracle. Oracle and Java are registered trademarks of Oracle Corporation and/or its 
affiliates. Other names may be trademarks of their respective owners. 

Java Magazine is published bimonthly with a free subscription price by 
Oracle, 500 Oracle Parkway, MS 0PL-3C, Redwood City, CA 94065-1600. 

Digital Publishing by GTxcel 




02 


ORACLE.COM/JAVAMAGAZINE /////// /////////////////////////////////// MARCH/APRIL 2015 


Prove Your 
Tech Creds 

Get Java Certified 



Save up to 20% 

k_J 


/ Get noticed by hiring managers 
Learn from Java experts 

✓ Join online or in the classroom 

✓ 96% of participants recommend it 

>/ 100% report promotions, raises, and more 







































//from the editor / 




ontinually improving Java performance is more important than ever. Not only does the platform 
continue to evolve, but cloud computing magnifies the effect of a single CPU. From configuring the Java Virtual 
Machine (JVM) to using best practices in your application code, you have lots of ways to improve performance— 
and in this issue, we offer several ideas. 




Java and 
Performance 


We begin with an analysis from KirkPepperdine, who examines when adding paral¬ 
lelism using the fork/join framework might help performance. Michael Heinrichs, in 
"The Quantum Phy sics of Java," looks at modern chip design and its effect on Java programs. Chris Newland 
discusses the effects of small source code changes and Java HotSpot VM switches in Java just-in-time (JIT) 
compilation, and Josh Juneau explores incorporating performance tuning into the development lifecycle. Our 
JCP Executive Series features Anil Kumar, a performance architect at Intel. And check out the big data Q&A: 
a panel from JavaOne 2014 answers the audience's most pressing questions. 

Finally, our cover story showcases Netflix, which fulfills 2 billion content requests everyday. Now that's what I 
call performance! The company can do so because most of the services running in its 
architecture are built on Java and the JVM. 

On a personal note, I've been filling in during the search for a new and permanent 
editor in chief. I want to thank the regular staff, who helped me over the rough spots 
and worked especially hard to get this issue out. And now I'm pleased to announce 
that Andrew Binstock, a veteran technology journalist most recently from Dr. Dobb's 
journal, will take over the helm of Java Magazine. Welcome, Andrew! 


//send us your feedback / 

We'll review all 
suggestions for future 
improvements. 
Depending on volume, 
some messages might 
not get a direct reply. 


Kay Keppler, Interim Editor in Chief 




PHOTOGRAPH BY BOB ADLER 



Find the Most 
Qualified Java 
Professionals 
for your 

Company's Future 

Introducing the 
Java Magazine Career 
Opportunities section - 
the ultimate technology 
recruitment resource. 

Place your advertisement 
and gain immediate access 
to our audience of top IT 
professionals worldwide 
including: corporate and 
independent developers, IT 
managers, architects and 
product managers. 


For more information 
or to place your 
recruitment ad or 
listing contact: 
tom.cometa@oracle.com 
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The JCertif Kigali 
Java developer 
conference was 
a huge success, 
sharpening the 
skills of hundreds of 
Java programmers. 
JCertif Inter¬ 
national has plans 
to expand its 
events to more 
African nations. 


JCERTIF 

TRAINING DEVELOPERS 
IN AFRICA 


A winner of the 2014 Duke's Choice Awards, JCertif International 
has trained more than 5,000 new Java developers across Africa 
since its start in 2010. JCertif Kigali and JCertif Tunisia were just 
its latest great successes. JCertif Kigali was the leading devel¬ 
oper community conference in Rwanda. Founder and Managing 
Director Max Bonbhel says, "We're still digesting the great num¬ 
ber of Likes collected during the event." The events are two- to 
five-day Java workshops for groups of 20 to 50 developers. JCertif 
is planning a series of events in 2015 including in Cameroon, 
Benin, Senegal, the two Congos, Ivory Coast, and Rwanda. 


PHOTOGRAPHS COURTESY OF JCERTIF INTERNATIONAL 
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FEATURED JAVA SPECIFICATION REQUEST 


JSR 359: 

SIP SERVLET 2.0 


SIP Servlet API 

http i/Tsi pservlet-spec.lava.net 


JSR 116 (vl.O) 


O f 


JSR 359 (v2,0) 


JSR 289 (vl.1) 


JSR 359: SIP Servlet 2.0 is all 
about advancing the state of 
Session Initiation Protocol 
(SIP) in the Java ecosystem. 

SIP is a communications 
protocol for signaling and 
controlling communica¬ 
tions sessions over Internet 
Protocol (IP) networks. 

Oracle's Binod PG, the 
Specification Lead for JSR 
359, describes the objectives 
of this JSR as "improving the 
RFC coverage of SIP specification (SIP outbound), modernizing programming 
(using POJOs), better alignment with Java EE (CDI, latest servlet specification), 
SIP over websockets, extensibility, better concurrency control, updates to B2B, 
and so on." 

The JSR 359 Expert Group includes representatives from small and large 
companies, including AT&T, Ericsson AB, IBM, Cisco Systems, Thrupoint, 
OpenCloud, and TeleStax, all of which are seeking greater alignment between 
SIP servlets and Java EE. 

If you'd like to follow the progress, comment on the work, or contribute new 
suggestions, visit the JSR 359 project atjava.net. There, you'll find a variety of 
ways you can share your ideas and get involved. 
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YOUNG DEVELOPER 


Hania Guiagoussou 



Guiagoussou California high school student 
holds her prize Hania Guiagoussou became 
from the SITIC a passionate Java developer 
Innovation after attending a Java pro- 

gramming summer workshop 

at Oracle two years ago. "If 
it weren't for Alice, I wouldn't be interested 
in programming," she explains in an inter¬ 
view. Her most recent project, a "Water Saver 
System," tracks the need for watering using 
humidity and temperature sensors in the soil. 
She and herteammate won awards at a local 
engineering and science fair as well as third 
place in the Computer Science, Math, and 
Engineering category of awards sponsored by 
the local utility company. She also won third 
place and US$10,000 in the Digital Innovative 
Challenges, an African competition. 
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The Israel Java User Group (Java.IL) 
was formed in mid-2014 as a consoli¬ 
dation of three smaller user groups: HaBotzia 

(founded by Asaf Mesika and Uri Shamay), JJTV 
(founded by Ron Gross and Tomer Gabel), and 

ILJUG (founded by Haim Yadid). 

Gabel notes that "the idea for combining the 
groups originated with Haim, whose efforts to 
bootstrap ILJUG led to the realization that com¬ 
munity thrives on diversity, and that each of the 
three groups seems to operate in its own silo. 

In consolidating the groups we've increased our 
reach, and having a much larger organizational 
body enables more frequent, diverse events." 

The combined Java.IL group had 540 mem¬ 
bers at the start of 2015. "Our membership 
spans the entire Israeli software landscape, 


FEATURED JAVA USER GROUP 


which includes young startups, established 
enterprises, and consulting firms," Gabel says. 
Between 50 and 100 members participate in 
monthly meetups. 

Gabel describes Java.IL meetups as focusing 
largely on "production war stories...whether it's 
an alternative language (for example, Clojure), 
a brand-newin-memory database, oran amaz¬ 
ing REST documentation framework, we seek 
and promote speakers with hands-on produc¬ 
tion experience. Sharing the real caveats can 
save an immense amount of time for partici¬ 
pants who are often afraid to take the plunge 
without knowing the risks." 

The Java.IL leadership takes a genuine interest 
in career development amongthe community, 
including developing presentation skills. "While 


we strive to get as many people as possible from 
the Israeli community on stage to showcase 
their projects, not all of us are accomplished 
public speakers," says Gabel. "We therefore 
offer guidance and assistance in everything 
from initial preparation and abstract, through 
developing a slide deck, to providing an oppor¬ 
tunity for dry runs of the lecture." 

Although Java.IL is relatively new, the group 
is already involved in regional events. For 
example, Java.IL organized the back-end track 
atCodeMotion 2014 in Tel Aviv, receiving 30 
submissions from the community in just two 
weeks. "This makes us immensely proud of our 
membership," says Gabel. 

The group is also involved in the Adopt 
OpenJDK effort. Java.IL members Haim Yadid 
and Anat Levinger developed and contrib¬ 
uted mi prof (the "monadic JVM profiler"), a 
command-line tool for analyzing series of Java 
stack traces. 

Updates, recommended links, and other 
tidbits are regularly published on the 
Java.ILTwitter channel (@Java_IL). The first 
issue oftheJava.IL monthly newsletter has 
been published. 

And there's much more to come. According 
to Gabel, "We have high plans for 2015!" 


PHOTOGRAPHS BY I LAN SHACHAM/GETTY IMAGES AND TOMER GABEL 
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Virtual Technology 
Summit Global Series 


P Oracle Technology Network organizes 
the Virtual Technology Summit (VTS), 

an interactive online event that takes 
place four times a year. The present¬ 
ers are Java Champions and Oracle 
product experts with hands-on ex¬ 
perience. In the last summit, the Java 
track showed how to get the most out 
of Java wherever it is: from the largest 
servers to the tiniest devices. Technical 
sessions focused on how to improve 
the compatibility and security of large 
applications; howto work with Maven, 
HTML5, and JavaScript; howto get 
started with the Internet of Things 
(IoT) and wearables; and more. Learn 
how to take advantage of Java's depth 
and breadth. Check out the next event. 
VTS is also available on demand. 



JAVA CHAMPION PROFILE 

MARKUS EISELE 



Markus Eisele is a soft¬ 
ware architect, devel¬ 
oper, and consultant. He 
works daily with custom¬ 
ers on enterprise Java 
projects. Eisele became 
a Java Champion in 
March 2014. 

Java Magazine: W h e re 

did you grow up? 

Eisele: In Germany, 
between Hannover and 
Paderborn. 

Java Magazine: 

When did you first 
become interested in 
computers? 

Eisele: Early on, when 
my parents bought me 
a Schneider CPC464, 
running CP/M 2.0 and 


Basic 1.0, with a cas¬ 
sette recorder for stor¬ 
age. One of my first pro¬ 
grams was a patient and 
appointment registry for 
a physical therapy prac¬ 
tice. Thousands of lines 
of code and GOTOs! 

Java Magazine: What 
was your first profes¬ 
sional programming 
job? 

Eisele: I developed a 
vacuum cleaner bag 
finderon the internet. 
You entered your vac¬ 
uum brand and model, 
and it spit out the fitting 
bag type from the prod¬ 
uct line. 

Java Magazine: What do 
you enjoy for fun and 
relaxation? 

Eisele: I' m most relaxed 
and happy in front of a 
computer. But I enjoy 
social interaction as 
well. 

Java Magazine: What 
happens on yourtypical 
day off work? 

Eisele: Since I love my 
work, there's no real 


"day off." I catch up 
with news from the 
Java community and 
my JBoss peers daily. If 
the weather is fine, we 
try to make something 
exciting happen for the 
kids—theme parks or 
stuff like that. 

Java Magazine: What 
side effects of your 
career do you enjoy the 
most? 

Eisele: Traveling. I love 
experiencing countries, 
people, habits, and 
culture; talking about 
technology; and engag¬ 
ing with people. Twitter 
and blogs are fine, but 
traveling helps me 
understand the culture 
of countries, so I can 
work effectively with 
developers and custom¬ 
ers there. 

Java Magazine: Has 

being a Java Champion 
changed anything for 
you with respect to your 
daily life? 

Eisele: Yes, mostly from 
Oracle's handling of 


PHOTOGRAPHS COURTESY OF TDC 2014 BRAZIL/GLOBALCODE AND MARKUS EISELE 



the Oracle ACE and Java 
Champion community 
programs. They sup¬ 
port us by maintaining a 
strong brand and help¬ 
ing us make the right 
connections. 

Java Magazine: What a re 
you looking forward to 
in the coming years? 
Eisele: The second edi¬ 
tion of JavaLand, which 
I want to be even big¬ 
ger and betterthan 
the inaugural edition. I 
look forward to talking 
about the amazingopen 
source communities 
that make up Red Hat's 
success, meeting new 
peers, and seeing their 
unique solutions. Also, 

I want my girls to have a 
great start in their new 
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EVENTS 

Devoxx France APRIL 8-10 
PARIS, FRANCE 

Devoxx France brings together more than 
a thousand java enthusiasts. Thisyear's 
event has a bigger venue to accommo¬ 
date the growing number of attendees. 
The conference focuses on java, java 
Virtual Machine (jVM) languages, web 
technologies, mobile, cloud, and agile 
development. This year's theme, "Infinite 
Possibilities," is about java innovation 
and the java ecosystem. 


JAX 2015 

APRIL 20-24 
MAINZ, GERMANY 
This conference covers a range 
of current and future-oriented 
technologies, from java and 
Scala to Android and web tech¬ 
nologies and agile development 
models as well as DevOps. More 
than 200 international speakers 
present 230 talks and 20 day¬ 
long sessions. 

Great India n_Developer 

Summit (GIDS) 

APRIL 21-25 

BANGALORE AND HYDERABAD, 
INDIA 

Thirty thousand developers 


attend this five-day conference, 
held in Bangalore on April 21 
through 24 and in Hyderabad 
on April 25. The main GIDS 
activities in Bangalore focus on 
cloud, mobile, dynamic lan¬ 
guages, and analytics. The one- 
day GIDS. MINI conference in 
Hyderabad offers sessions that 
reflect the main conference. The 
dedicated java track focuses 
on dynamic languages such 
as Scala, Clojure, Groovy, and 
j Ruby, which run on the jVM. 

GeeCON 2015 

MAY 13-15 

KRAKOW, POLAND 

The three-day GeeCON 2015 


focuses on java and jVM-based 
technologies, dynamic lan¬ 
guages, enterprise architec¬ 
tures, patterns, distributed 
computing, software craftsman¬ 
ship, mobile, and much more. 

JEEConf2015 

MAY 22-23 
KIEV, UKRAINE 
jEEConf is the largest devel¬ 
oper conference in Ukraine. The 
annual conference focuses on 
java technologies for application 
development. 

Devoxx UK 

JUNE 17-19 
LONDON, ENGLAND 
In its third year, Devoxx UK is 
organized by the local java com¬ 
munity. Its focus is on java, web, 
mobile, and jVM languages. The 
conference is run by top devel- 
opertalent, community groups, 
and expert event specialists. 

javaOne Brazil 

JUNE 23-25 
SAO PAULO, BRAZIL 
javaOne Brazil is the learning 
opportunity for everything java 
in Latin America, from java 8 
and the new features of java 9 
to the Internet of Things, jVM 
languages, and more. Meet 
Oracle and community experts 
in a fun atmosphere. 


PHOTOGRAPH BY DENNIS JARVIS AT FLICKR 
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JAVA BOOKS 



JAVA 8 IN ACTION: LAMBDAS, 
STREAMS, AND FUNCTIONAL-STYLE 

PROGRAMMING 

By Raoul-Gabriel Urma, Mario Fusco, 
and Alan Mycroft 

Manning Publications, August 2014 
Java 8 in Action, a guide to the 
new features of Java 8, shows you 
how to write more-concise code 
in less time and automatically 
benefit from multicore archi¬ 
tectures. It introduces lambdas 
using real-world Java code. It 
covers the new Stream API and 
shows how you can use it to make 
collection-based code easierto 
understand and maintain. It also 
explains other major Java 8 fea¬ 
tures, including default methods, 
Optional, CompletableFuture, 
and the new Date and Time 
API. Written for programmers 
familiar with Java and basic 
object-oriented programming, 
it demonstrates howjava 8 is a 
game changer. 



JAVA EE 7 DEVELOPMENT WITH 

NETBEANS 8 

By David R. Heffelfinger 
Packt Publishing, January 2015 
Learn how you can use the fea¬ 
tures of NetBeans to acceler¬ 
ate yourdevelopment of Java EE 
applications. Coveringthe lat¬ 
est versions of Java EE APIs such 
asJavaServer Faces (JSF) 2.2, 
EnterpriseJavaBeans (EJB) 3.2, 

Java Persistence API (JPA) 2.1, 
Contexts and Dependency 
Injection 1.1, and JAX-RS 2.0 Java 
EE 7 Development with NetBeans 8 
walks you through application 
development usingJSF compo¬ 
nent libraries such as PrimeFaces, 
RichFaces, and ICEfaces. Find out 
how you can quickly and easily 
develop applications taking advan¬ 
tage of JPA, develop aesthetically 
pleasingJSF web applications, and 
develop SOAP-based and RESTful 
web services using NetBeans, 
including exposing EJB functional¬ 
ity as web services. 
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The Java Tutorial 

* CtuTf rr i hi Bis** 


Continuous 
Enterprise 
Development 
in Java 


THE JAVA TUTORIAL: A SHORT 

COURSE ON THE BASICS, 

6TH EDITION 

By Raymond Gallardo, Scott Hommel, 
Sowmya Kannan, Joni Gordon, and 
Sharon Biocca Zakhour 
Addison-Wesley Professional, 

December 2014 

Based on Java SE 8, this updated 
edition introduces the new fea¬ 
tures added to the platform, 
including lambda expressions, 
default methods, aggregate opera¬ 
tions, and more. An accessible and 
practical guide for programmers of 
any level, The Java Tutorial focuses 
on howto use the rich environ¬ 
ment Java provides to build appli¬ 
cations, applets, and components. 
Expanded coverage includes 
chapters on the Date and Time 
API, annotations, and the latest 
deployment best practices. Key 
security sections include "Security 
in Rich Internet Applications" 
and "Guidelines for Securing Rich 
Internet Applications." 


CONTINUOUS ENTERPRISE 

DEVELOPMENT IN JAVA 

By Andrew Lee Rubingerand 
Aslak Knutsen 
O'Reilly Media, March 2014 
Learn a use-case approach for 
developingjava enterprise appli¬ 
cations in a continuously test- 
driven fashion. With this hands-on 
guide, authors and JBoss project 
leaders Rubingerand Knutsen 
show you howto build high-level 
components, from persistent stor¬ 
age to the user interface, using the 
Arquillian testing platform and 
other JBoss projects and tools. 




































Reach More than 800,000 
Oracle Customers with 

Oracle Publishing Group 

Connect with the Audience that 
Matters Most to Your Business 


oracle: 


Oracle Magazine 

The Largest IT Publication in the World 

Circulation: 500,000 

Audience: IT Managers, DBAs, Programmers and Developers 


PROFIT ■ Profit 

Business Insight for Enterprise-Class Business Leaders to 
Help Them Build a Better Business Using Oracle Technology 

Circulation: 100,000 

Audience: Top Executives and Line of Business Managers 



Java Magazine 

The Essential Source on Java Technology, the Java Programming 
Language and Java-Based Applications 

Circulation: 200,000 and Growing Steady 
Audience: Corporate and Independent Java Developers, 
Programmers, and Architects 


Copyright © 2015, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of 
their respective owners. 
















loT Developer Challenge Winner 

Smart Greenhouse 

Control the light, watering, temperature, and humidity of 


your greenhouse—automatically. 

S mart Greenhouse, one of three 
professional category winners in 
the 2014 IoT Developer Challenge, 
is an Internet of Things (IoT) device 
and application that monitors and 
controls a greenhouse environment. 
The concept for Smart Greenhouse 
came into being after the core team— 
Dzmitry Yasevich, Pavel Vervenko, 
and Vladimir Redzhepov—attended 
javaOne Russia in April 2013. There, 

PHOTOGRAPH BY BOB ADLER/GETTY IMAGES 


BY KEVIN FARNHAM 

the team saw presentations of a smart 
house, various robots, and other 
devices, all controlled by Java. 

Yasevich notes, "We were impressed 
by these solutions and had an idea to 
do something like that. Pavel Vervenko 
suggested making an automated 
greenhouse. Everyone liked the idea!" 

First, the team selected the hard¬ 
ware. "We started to use Raspberry Pi 
as a basis," Yasevich says. "It is a com- 
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Smart Greenhouse IoT 
Developer Challenge 
winners Vladimir 
Redzhepov (left) and 
Dzmitry Yasevich 



pact but full-fledged com¬ 
puter with 700 MHz and 
memory at 512 MB. This 
system costs around $35." 

However, early on, a safety 
concern arose. "Current 
under high voltage passes in the 
greenhouse, and there is an automatic 
watering system, so it was necessary 
to properly consider all the aspects 
related to insulation," Yasevich says. 

As the software was being imple¬ 
mented, a hardware driver difficulty 
arose. "We had a problem with Pi4J, 
because it does not have the necessary 
API to work with the DHT11 humidity 
sensor," says Yasevich. To solve this, 
the team used JNI/C, writing a custom 


P Click here to watch time-lapse 
photography of the beans 
growing. 


FAST FACTS 

■ Smart Greenhouse 
automates green¬ 
houses, controlling light 
and watering, monitor¬ 
ing temperature and 
humidity, and providing 
live photography. 


driver for the sensor. 

More people became involved, 
including intelligent-hardware expert 
Vasili Slapik, and Vladimir Redzhepov, 
who was instrumental in making an 
open source Linux-based distribu¬ 
tion available for Smart Greenhouse. 
Yasevich says, "Any farmer can down¬ 
load this distribution, buy a Raspberry 
Pi and the necessary sensors, and 
get their own smart greenhouse into 
operation." Visit the Smart Greenhouse 
Drop_box, and download the source 


■ The hardware consists 
of a Raspberry Pi, sen¬ 
sors, an Arduino expan¬ 
sion board, power 
supply, SD card, cooler, 
and web camera. 

■ The backing software 
is composed of Oracle 
Java SE Embedded, 
JNI/C, Pi4J, servlets, 
and a Jetty container 
with an HTML/ 
JavaScript front end. 
Watch the video. 


code at BitBucket. </article> 
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Java Image Enabling SDKs that Help You Seethe Big Picture 


At first glance it may seem difficult but it's really quite simple. Atalasofts 
Joltlmage product is a proven SDK for image enabling your Java-based web 
applications, easily. Image enabling helps to add dimension to your data, so 
you can uncover insights such as correlations and causations hidden inside 
your 2-dimensional documents. Our SDK does the heavy lifting for you, 
saving time, money, and the headaches of figuring it out yourself. Backed by 
our highly knowledgeable & caffeinated support engineers, Joltlmage will 
enable your success and make the big picture so much easier to see. 




At a la soft 


Joltlmage 


Click for tips on viewing the stereogram 


Image enabling experts & bacon connoisseurs. Visit us online to see our full line of SDK products for .NET and Java 


Aialosofi 

w ww.ata lawfuom 






































PHOTOGRAPHY BY BOB ADLER/GETTY IMAGES 


ORACLE.COM/JAVAMAGAZINE /////// /////////////////////////////////// MARCH/APRIL 2015 


JCP Executive Series 

INTEL AND ORACLE: 

Performance, 
Innovation, 
and Success 

Intel’s Anil Kumar describes how 
collaboration drives a healthier 
Java ecosystem. BY STEVE MELOAN 




Java and 
Performance 


A nil Kumar, a performance architect in the 
Software and Services Group at Intel, plays 
an active role in the Java ecosystem. He 
contributes to several benchmarhs and standards 
organizations—as Java chair for the Open Service 
Gateway (OSGi) specification, and as Intel's rep¬ 
resentative on the Java Community Process (JCP) 
Executive Committee. He enables better user 
experience and resource use through customer 





































applications, and identifies optimization 
opportunities and default performance 
for hardware and software configura¬ 
tions. Kumar has presented at JavaOne, 
Soujava Brazil, and the International 
Conference on Performance Engineering 
(ICPE) on topics related to Java optimiza¬ 
tions, application latency, and system 
power consumption. 

Java Magazine: What are the coengi¬ 
neering processes that Intel and Oracle 
have engaged in around performance— 
within the realms of big data, cloud, 
Internet of Things [IoT], and computa¬ 
tionally intensive applications? 

Kumar: The coengineering work 
between Intel and Oracle is multifac¬ 
eted. These areas can be diverse, and 
they each have large ecosystems. As a 
result, we have multiple coengineering 
efforts, with teams working at varying 
levels. Some are focused at the compo¬ 
nent level, while others operate at the 
strategy level. But the core goal is to 
maximize performance and to seam¬ 
lessly enable these ecosystems. 

From Intel's perspective, performance 
means the complete end-user experi¬ 
ence—response time, certainly, but 
also ease of use, and interoperability 
between the components within these 
environments. The computationally 
intensive application space is mature 
but still growing. This includes every¬ 



thing from weather prediction to graph 
simulations, chemical and atomic 
reactions modeling, life sciences, and 
statistical process control. 

The cloud plays an important tech¬ 
nology role—enormous numbers of 
mobile devices require processing and 
storage in the cloud. So Intel and Oracle 
also collaborate there to better ensure 
security, reliability, and performance. 

And the IoT has seen explosive 
growth in the past few years. Many 
players—both hardware and software— 
occupy this space. But there is also a 
great deal of fragmentation, with differ¬ 
ent approaches, different types of soft¬ 
ware, and different hardware environ¬ 
ments. It will be important for Intel and 


Oracle to collaborate around the evo¬ 
lution of standards in this space, and 
to develop effective strategies around 
such collaborations. 

But the good news is that Intel and 
Oracle have a long history of fruitful 
collaborations. If we look at standard 
benchmark data—for example, specifi¬ 
cation JBB [Java Business Benchmark]— 
we see that hardware and software 
total performance increased 30 times 
from 2007 to 2013. And such collabora¬ 
tions have benefits far beyond simple 
performance in terms of the evolu¬ 
tion of Java standards. When we work 
together around standards, it creates a 
healthier ecosystem—more competi¬ 
tion, which drives innovation and which 


Left to right: 

Eric Kaczmarek, 
Pranita Maldikar, 
Anil Kumar, and 
Soji Denloye 
collaborate 
informally. 
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Mike Jones (left) 
and Anil Kumar do 
some hands-on work 
for the hardware 
side. The group 
identifies optimization 
opportunities and 
default performance 
for hardware 
and software 
configurations. 


results in better performance and ease 
of development and deployment. 

Java Magazine: You've been Intel's 
representative on the JCP Executive 
Committee since 2011. Can you discuss 
what motivated Intel to join the JCP and 
the benefits you see for current and 
future technology initiatives? 

Kumar: In 2002, when Intel first joined 
the JCP, a hardware platform was typi¬ 
cally tied to just one vertical stack. But 
Java enabled developers to deploy the 
same applications on different archi¬ 
tectural platforms. Intel joined the JCP 
to help contribute to the Java platform's 
growth, evolution, and innovation. It 


ORACLE.COM/JAVAMAGAZINE ////////////////////////////////////////// MARCH/APRIL 2015 


THE BENEFIT OF STANDARDS 

“When we work together around standards, it creates a 
healthier ecosystem—more competition, which drives 
innovation and which results in better performance.” 


was an opportunity for 
Intel to help empower our 
portion of the Java ecosys¬ 
tem, which was undergo¬ 
ing major changes at the 
time. And that process has 
been fruitful. If you look 
at the past 10 to 15 years, 
Java has played a powerful 
and influential role within 
a variety of technology 
ecosystems. Intel has par¬ 
ticipated in 14JSRs. 

Since 2011, it has been my opportunity 
to participate in defining Java standards 
within the JCP. With the IoT and other 
technology evolutions on the horizon, 
plenty of challenges and collaborative 
opportunities lie ahead. So I'm enjoy¬ 
ing the process and looking forward to 
the future. [More details of Intel's ear¬ 
lier participation on the JCP Executive 
Committee can be found here.] 

Java Magazine: What other benefits do 
you see in working with the JCP? 

Kumar: Intel's focus goes beyond sim¬ 
ply increasing compute power. We have 
become a platform company, and that 


means it's no longer just about the 
CPU. Many other evolving features and 
elements benefit the end user. Being 
a part of the JCP provides us a path for 
contributing to the evolution of the 
Java platform and leveraging our lat¬ 
est hardware features most effectively. 
But the resulting standards also benefit 
developers and end users. By enabling a 
single powerful implementation across 
multiple environments, the entire eco¬ 
system benefits. 

Java Magazine: At JavaOne 2014, Intel 
announced it was joining the OpenJDK 
community as a contributing member. 
What are the motivations and benefits 
for this role within OpenJDK? 

Kumar: OpenJDK provides an environ¬ 
ment for innovation at a rapid pace. 
And many of the most important 
technology players are already part 
of the OpenJDK community. That has 
created a larger and healthier ecosys¬ 
tem for competition and collabora¬ 
tion. Intel's new features—including 
security features, networking, and 
storage—are rapidly evolving. The JCP 
standards process understandably 




co 


i- 

=) 

o 

DO 

< 



blog 
























takes time, because it requires a more 
thorough scrutiny. But at the same 
time, OpenJDK provides a fast path to 
enable these hardware features in a 
robust manner. 

Java Magazine: Math libraries and 
analytic solutions are important within 
the big data space. Is this also part 
of Intel's work within the OpenJDK 
community? 

Kumar: From Intel's perspective, there 
are two basic types of performance- 
related innovations. At the hardware 
layer, we focus on accelerators in 
functional areas. But it's also possible 
to fine-tune the software layer of the 
stack, which can exploit either exis¬ 
tent or new hardware features more 
efficiently. Because our goal is always 
better performance for the end user, 
Intel strives to contribute to both the 
software and hardware stack evolu¬ 
tions within OpenJDK. 

Java Magazine: At JavaOne 2014, 
Michael Greene [Intel vice president, 


Software and Services Group] discussed 
Intel's work with the Hadoop commu¬ 
nity to optimize hardware related to big 
data solutions. What is Intel's vision 
of empowering the flow from data- 
nibbling sensors, to number-crunching 
servers, to end-user services and 
insights? What are the implications for 
performance demands? 

Kumar: As part of our focus on big 
data-related libraries, Intel is working 
with the Hadoop community and the 
OpenJDK community to ensure that 
stacks achieve maximum performance. 
Garbage collection is one major area of 
interest. And for future deployments, 
we are working on accelerators for 
encryption, compression, and com¬ 
putational analytics. We've recently 
published several white papers and 
videos that explore how end users can 
benefit on these fronts within the big 
data space. (See the "Working with 
Hadoop" sidebar.) 

At the CES 2015 keynote, our CEO, 


BUILDING OPENNESS AND COMPETITION 

“Any technology environment is made more robust by 

greater openness and competition. Openness serves to 
speed evolution and innovation.” 
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Brian Krzanich, announced several 
breakthrough technologies for con¬ 
sumer electronics, including innova¬ 
tions within the IoT space. There are 
many evolutions occurring in the IoT, 
including our own internal performance 
streamlining. I am involved in this pro¬ 
cess with data processing and analyt¬ 
ics at the back end. The pain points for 
end users are enormous data transfers, 
large memory footprints, and secu¬ 
rity issues. Therefore we are working 


Anil Kumar (left) 
and Liqi Yi catch up. 
Intel works with the 
Hadoop and OpenJDK 
communities to 
ensure that stacks 
achieve maximum 
performance. 






















Working with Hadoop 

Intel has a dedicated team work¬ 
ing to improve the performance 
and the end-user experience for big 
data users. Readers can find many 
basic to expert-level optimizations 
for Hadoop in these Hadoop-related 
white papers, talks, and videos: 
“Optimizing Java and Apache 

Hadoop for Intel Architecture” 
“Accelerate Big Data Ana lysis with 

Intel Technologies” 

“Tuning Java Garbage Collection 

for HBase” 

Video interviews with Eric 
Kaczmarek, big data and Java 
performance architect, and Soji 
Denloye, Java performance engineer: 
Part 1, Part 2, Part 3, and Part 4. 



to improve our deployments on these 
fronts, and to maximize our solutions 
in future hardware designs. 

Java Magazine: What are the challenges 
of preserving standards while also pro¬ 
moting future innovation? 
How does that affect Intel 
and its initiatives? 

Kumar: The existent 
standards have worked 
very well up to now. But 
discussion and debate 
within the community 
have been increasing 
around IoT. This area is 
rapidly evolving, and now 
is the time for a major 
focus on the balance 
between innovation and 
standards. Even if we 
begin considering these 
issues now, it will be a 
one- or two-year time 
frame for the processes 
to play out. So we hope 
more companies will take 
a role in developing new 
standards around the 
IoT space. 

Java Magazine: What recent JSRs has 
Intel been involved with, and what were 
the issues for your industry initiatives? 
Kumar: We've been contributing to sev¬ 
eral JSRs. One was focused on greater 
openness within theJCP. From Intel's 


perspective, any technology environ¬ 
ment is made more robust by greater 
openness and competition. Openness 
serves to speed evolution and innova¬ 
tion. And of course, performance is also 
a key objective. So there are relevant 
JSRs in the performance realm—for 
example, lambda expressions in Java 
(JSR 335), aiding computationally inten¬ 
sive tasks, enabling multiple cores (JSR 
166 fork/join framework), and making 
the overall programming paradigm eas¬ 
ier. As more data is collected through 
sensors, telemetry, health devices, and 
so on, companies need to analyze it in 
multiple dimensions, which requires 
tremendous compute power. The way 
that data is vectorized can also produce 
manyfold benefits. Intel is exploring 
a JSR focused on autovectorization of 
data—where a developer codes in the 
traditional fashion, but at the JIT [just- 
in-time] level we would autovectorize. 
AtalkatJavaOne 2014, "Optimizing the 
Future of Java Thro u gh Collaboration." 

highlighted advanced features such as 
Intel AVX/AVX2 SIMD, upcoming Intel 
TSX, and java.util.zip CRC32 accelera¬ 
tion, which is another example of end 
users benefiting from Intel and Oracle 
collaboration. 

Java Magazine: Has Oracle delivered on 
the promise of increased transparency 
and openness within the JCP? 

Kumar: From my experience as an 


Executive Committee member, I would 
say very much so. Oracle and the pro¬ 
gram management office delivered 
the transparency JSR, which has had 
a significant impact. And I've seen 
the JCP program office working with 
the Specification Leads to ensure that 
JSRs adhere to the new transparency 
requirements. That demonstrates a 
concerted effort toward openness, not 
just in following the letter of the law. 
Java Magazine: What structural/ 
procedural changes could you imagine 
further improving or strengthening 
theJCP? 

Kumar: Two JSRs (JSRs 348 and 358) 
are related to this question. One—to 
simplify the membership and contribu¬ 
tion process—is focused on individual 
members. The other JSR addresses 
recent JSPA [Java Specification 
Participation Agreement] discussions 
focused on how to better manage IP 
[intellectual property] flow policy. We 
are seeing innovations in the IoT space, 
as well as in the traditional back-end 
data space. But the current IP flow is 
complex. If we can streamline that 
process, it will better facilitate collabo¬ 
ration and participation among both 
current and newJCP members. 

Java Magazine: How can the JCP better 
serve the Java community? 

Kumar: Several JSRs pertaining to the 
cloud and big data have been func- 
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Anil Kumar (left) 
and Soji Denloye 
discuss issues on 
the fly. Increasingly, 
debate in the Java 
community is about 
the Internet of 
Things (loT) and how 
to balance innovation 
and standards. 


tioning well—there's one on concur¬ 
rency (JSR166), and others are related 
to annotation (JSRs 269 and 308) and 
management (JSRs 73, 77, 88, 352, and 
so on), which enable easier deploy¬ 
ment. But similar approaches and JSRs 
in the IoT space are still somewhat 
lacking. The JCP serves primarily as a 
governance and approval body, but an 
active push toward facilitating new JSRs 
around IoT would both enhance the 
perception of the JCP and better serve 
the Java community. 

Java Magazine: What would you like 
Java developers to understand about 
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the processes of the JCP? 
Kumar: Before I joined the 
Executive Committee, my 
impression as a Java devel¬ 
oper was that corporate 
representatives played the 
primary roles within the JCP 
and other such governing 
bodies. But having now been 
a part of the JCP for the past 
three years, I've discovered 
that individual contributors 
and Java developers can also 
play important roles. There 
were mechanisms in place 
for such contributions, and 
the JCP has worked hard 
to establish even greater 
openness and transparency. 

It would be wonderful to 
provide examples of past and current 
roles that Java developers have played 
within the JCP—to explore the contri¬ 
butions that have been made, and how 
that benefited the entire Java commu¬ 
nity. It would serve to inspire others to 
become involved, and to demonstrate 
the real-world possibilities of contribut¬ 
ing within the JCP. 

Java Magazine: Any parting thoughts? 
Kumar: When I look back over the past 
10 years, I think few could have imag¬ 
ined the innovations in the hardware 
space—mobile computing, the cloud, 
big data, and IoT. Looking at platform- 


level innovations that have been deliv¬ 
ered by Intel and Oracle, I couldn't have 
imagined we would each be playing 
such critical roles in these areas. It has 
been a fascinating decade to witness. 
When I consider the next five years, 
while it will be different on many levels, 
I am certain that Oracle and Intel will 
continue to play major roles. I want to 
see those roles continue along a col¬ 
laborative path, through a standards 
body such as the JCP—one that is open, 
agile, and proactive. Such collabora¬ 
tion breeds competition and innova¬ 
tion, which ultimately benefits the 
entire technology ecosystem and the 
end user. </articie> 


MORE ON TOPIC: 




Java and 
Performance 


Steve Meloan is a former C/UNIX soft¬ 
ware developer who has covered the web 
and the internet for such publications as 
Wired, Rolling Stone, Playboy, SF Weekly, 
and the Son Francisco Examiner. He 
recently published a science-adventure 
novel, The Shroud, and regularly contrib¬ 
utes to The Fluffington Post. 
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From left to right: Rob Fletcher, Clay 
McCoy, Andy Glover, Adam Jordens, 
Brian Johnson, and Shelley Bower are 
part of the always-on Netflix team. 


are even ditching their traditional satellite 
and cable services in favor of an "all-you- 
can-eat" TV service. For about USS10 per 
month, Netflix members can watch as many 
movies and shows as they want—anytime, 
anywhere, on nearly any internet-connected 


Netflix powers through 2 billion 
content requests per day with 
Java-driven architecture. 


P utting members in control and serv¬ 
ing up targeted content based on 
their preferences has made Netflix the 
world's leading internet television network. 
Just about everyone is familiar with its flex¬ 
ible, on-demand model. Some consumers 
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NETFLIX 

netf1ix.com 


Headquarters: 

Los Gatos, California 

Revenue: 

USS4.4 billion 

I 

Employees: 

2,300 

Java technologies 
used: 

Java EE 6, Java EE 7, 
and Java EE 8 



screen. They can play, pause, and 
resume watching, all without commer¬ 
cials or commitments. 

Java's innate scalability has played a 
key role in enabling Netflix to expand 
its service to more than 57 million 
members in 50 countries. Collectively, 
these viewers enjoy more than 1 billion 
hours of shows and movies per month, 
including original series, documenta¬ 
ries, and feature films. 
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Andy Glover and Jon Schneider 
(left) check processes, while 
Glover and Adam Jordens (top) 
confer. “The vast majority of 
the services running within our 
architecture are built on Java 
and the Java Virtual Machine,” 
Glover says. 


"The vast majority of the services 
running within our architecture are built 
on Java and the Java Virtual Machine 
[JVM]," says Andrew Glover, delivery 
engineering manager at Netflix. "Netflix 
uses a stateless architecture, so as we 
bring in more customers we are able 
to bring up more instances relatively 
easily. We have thousands of Java pro¬ 
cesses running all the time, yet as we 
grow we don't have huge infrastructure 


challenges. We also have a lot of open 
source tools that are Java-based, which 
makes it easy to monitor, upgrade, and 
scale our services." 

Glover has a distinguished history of 
executive leadership, yet his passion 
for writing code hasn't wavered. Before 
joining Netflix, he held senior positions 
at Vanward Technologies, Stelligent, 
and App47. Today, he manages a team 
of seven engineers who are responsible 































“We have thousands 
of Java processes 
running all the time, 
yet as we grow we 
don’t have huge 
infrastructure 
challenges,” 

Glover says. 


for the continuous delivery and deploy¬ 
ment of new software within the vast, 
worldwide Netflix infrastructure. "Rapid 
turnaround is an important differentia¬ 
tor for IT, and it is certainly a competitive 
advantage for Netflix," he emphasizes. 
"Thanks to our continuous delivery 
platform, if a business unit wants a new 
feature pushed into production, we can 
do it within a couple hours." 

Glover is echoing one of the core 
mantras at Netflix: move fast, make 
decisions fast, and do it all on a solid 
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FACILITATING INNOVATION 

“We need to make decisions fast, 

based on the data we collect.” 

—Andy Glover 


architecture. "Because we are a data 
company we need to be able to facili¬ 
tate rapid innovation across the globe," 
he adds. "We need to make decisions 
fast, based on the data we collect, so 
we can provide the viewing experiences 
that our customers want—and stay 
ahead of competitors." 


HOW NETFLIX WORKS 

Netflix members love being able to 
browse, select, and play movies with a 
few clicks of a remote control or mouse, 


but not many of them stop to think 
about the sophisticated hardware and 
software infrastructure that delivers the 
content. Whenever you launch Netflix 
toqueueup Ca.ddysha.ck, Some Like it 
Hot, Breaking Bad, or any other movie 
or television show, Netflix first executes 
some core Java services to determine 
who you are, where you live, if you paid 
your bill, and what display device you 
are using. Then its content delivery net¬ 
work determines the closest cache from 
which to stream your content, buffered 
to the optimum audio and video quality 
for your connection at that moment. 

"When a customer launches Netflix, 
behind the scenes the system kicks 
off about a dozen different processes 
to authorize that person, figure out 
what device he or she is using, make 
sure the account is current, and look at 
their recent activity," explains Glover. 
"That kind of orchestration is done 
with Eureka, an open source tool based 
on Java." 




























Netflix Fast Facts 

Year founded: 1997 

Total number of subscribers 

57.4 million 

Number of hours per month 
that people spend watching 
Netflix: More than 1 billion 


Netflix stores master copies of digi¬ 
tal films in Amazon Web Services and 
Amazon Simple Storage Service clouds. 
Each film is encoded into more than 50 
versions based on video resolution and 
audio quality—amounting to more than 
a petabyte of data. From there it is dis¬ 
tributed to content delivery networks 
such as Akamai. Limelight, and Level 3. 
which in turn feed it to local internet 
service providers, and on to viewers. 

The infrastructure is based 
on a service-oriented archi¬ 
tecture (SOA) that handles 
approximately 2 billion content 
requests per day. "We have 
thousands of microservices 
running all the time," explains 
Glover. "This configuration 
offers lots of independence to 
our developers as they deploy 
new software." 


Percentage of viewers 
who binge-watch shows: 

61 percent 

Bandwidth used: During 
peak times, Netflix accounts 
for 33 percent of North 
America’s downstream 
internet traffic. 



CONTINUOUS DELIVERY 

Netflix engineers constantly 
deliver new features to the 
company's worldwide user 
base via continuous delivery. 

It's the cornerstone of all the 
company's business initiatives, 
with engineers using continu¬ 
ous delivery processes to rapidly 
push enhancements and bug 
fixes to customers, at low risk and with 
minimal overhead. According to Glover, 
this automated pipeline enables engi¬ 
neering teams to move concise feature 
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Shelley Bower, Brian Johnson, and Mike McGarr strategize. Netflix uses its continuous 
delivery system to schedule deployments based on up-to-the-second traffic statistics. 


sets from development and testing into 
a stable production environment. "The 
Netflix software environment changes 
a little bit every day," he adds. "The 
continuous delivery environment is the 
conduit for speed and innovation. We 
don't have the luxury of downtime. We 
can't tell customers to come back in 
three hours after we have completed 
a deployment." 

Performance problems can cause 
buffering errors for viewers, so Netflix 


uses its continuous delivery system to 
schedule each deployment based on 
up-to-the-second traffic statistics. For 
example, if Europe is in its peak view¬ 
ing time, the delivery team might elect 
to delay a deployment until off-peak 
hours. "One of the core metrics that 
guides the business is the number of 
streaming starts per second," contin¬ 
ues Glover. "We watch this closely in 
every region. If there is a dip, we are 
alerted immediately." 
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1 PETABYTE DAILY 

“The bread and butter 

of Netflix involves 
using data to 
recommend shows 
that customers 
might like.” 

—Andy Glover 



FOLLOW THE DATA 

Streaming video content over the inter¬ 
net isn't difficult. Technically, there is 
a low barrier to entry. What differenti¬ 
ates Netflix among video-on-demand 
services is all the data it collects about 
what customers are watching, how 
often they use the service, what choices 
they make before and after each show 
is completed, which genres they pre¬ 
fer, and many other interrelated data 
points. Java programs capture about a 
petabyte of data each day for the rec¬ 
ommendation engine, which suggests 


other shows and movies that it thinks 
viewers will like. 

Netflix also uses this data to deter¬ 
mine what content it should buy versus 
produce—as well as just how to produce 
original content to maximize viewer- 
ship. For example, when Netflix pur¬ 
chased the licensing rights to House of 
Cards, it was comfortable with the big 
investment because David Fincher, the 
showrunner, and Kevin Spacey, its star, 
were very popular on Netflix. 

"The bread and butter of Netflix 
involves using data to recommend 


Chris Berry (foreground) 
and Andy Glover consider 
options. Netflix uses its data 
to determine what content it 
should buy or produce. 


shows that customers might like," 
Glover says. "Everyone at Netflix plays 
a role in either capturing that data or 
making use of that data. Our business 
units can make changes quickly, partly 
because the entire infrastructure is built 
on Java." 

GROUNDED IN JAVA 

Netflix Open Connect includes Java pro¬ 
cesses that are installed in third-party 
data centers to govern how content 
is exchanged, cached, and streamed. 
According to Glover, this SOA includes 
thousands of instances that communi¬ 
cate with one another to execute busi¬ 
ness processes. "As Sun [Microsystems] 
used to say, The network is the com¬ 
puter.' It's all commodity hardware 
running Linux Ubuntu and Java EE 7. 

If a node fails, we can bring up a new 
one easily." 

While the infrastructure runs on 
Linux computers, video encoding is 
often done on Windows machines, 
facilitated by a Java-based framework. 
Java's easy portability among plat¬ 
forms permits flexibility, now and in 
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From left to right: 

Cameron Fieber, Andy 
Glover, and Zan Thrash 
analyze results. Netflix 
uses commodity hardware 
running Linux Ubuntu. 


the future. It also makes it easy to hire 
skilled engineers. "When it comes to 
recruiting, it is easy to find Java devel¬ 
opers," Glover notes. "We can always 
find top talent because we are not 
chasing some brand-new technology or 
language that only hipsters know." 

Glover believes Java 8 has revolution¬ 
ized Java, calling it "the most significant 
release of Java in the last 10 years." He 
especially likes default methods, lamb¬ 


das, and the java.util.stream API, which 
enables functional-style operations on 
streams of elements, such as Hadoop 
MapReduce transformations on 
collections. "Functional-style program¬ 
ming as a core language feature is a 
powerful construct that will have a 
big impact on the developmental 
landscape," he predicts. "Java 8 dra¬ 
matically changes the Java language 
for the better." 



Netflix uses Spring Boot as the basis 
of its SOA because it offers the scalabil¬ 
ity and maturity of the JVM. "Netflix is a 
giant SOA," Glover says. "Java platform 
services make it possible for a devel¬ 
oper to quickly come up to speed and 
write a service that works in our archi¬ 
tecture. Our operational footprint on 
the JVM is tried and true. Spring Boot 
is a good example of a web framework 
that you can get up and running in 
short order. There is still a tremendous 
amount of innovation within Java and 
in the Java ecosystem. It's solid. It's not 
going anywhere. That's why there is so 
much infrastructure being built on Java, 
such as Hadoop." 

Java provides both the flexibility and 
scalability Netflix needs. "Our technol¬ 
ogy infrastructure is not a house of 
cards," he concludes. "In order to con¬ 
tinue to be the number one streaming 
video content provider on the internet, 
we have to continually innovate. Java is 
the bedrock of our software architec¬ 
ture. The core operational infrastructure 
of Netflix, including the monitoring 
tools and platform services, has all 
been built in Java." </articie> 
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Based in Santa Barbara, California, 
David Baum writes about innovative 
businesses and emerging technologies. 



co 


i- 

=) 

o 

DO 

< 



blog 




ORACLE.COM/JAVAMAGAZINE ////////////////////////////////////////// MARCH/APRIL 2015 

































//java architect / 



KIRK PEPPERDINE 


Improving the Observability of 
Fork/Join Operations 

How to determine when adding parallelism might help performance 





Java and 
Performance 
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T he introduction of lamb¬ 
das and streams in Java 
SE 8 allows Java developers 
to easily add parallelism to 
their applications. But will 
making an application more 
parallel actually improve its 
performance? This is the sort 
of question that we at jClanty 
face on a regular basis. 

To answer that ques¬ 
tion, let's take a look at how 
ForkJoinPool, the common 
thread pool introduced in Java 
SE 7, is configured and how 
it is being utilized. Let's also 
take a look at howto instru¬ 
ment this framework so that 
we can get metrics that will 
tell us when and how to con¬ 
figure it. 

In the March 2005 edi¬ 
tion of Dr. Dobb's Journal, 

Herb Sutter uttered the now 
infamous phrase, "the free 
lunch is over." That was fol¬ 
lowed by "Concurrency is 


the next major revolution 
in how we write software," 
which was followed by lots of 
talk about Amdahl, Moore's 
law, heat dissipation, how 
programmers and program¬ 
ming languages needed to 
change, and so on. There was 
also a lot of talk about read- 
write barriers and lock-free, 
wait-free algorithms as well 
as the use of techniques that 
rely on "divide and conquer" 
techniques. One of the divide 
and conquertechniquesthat 
has gained a lot of favor is the 
fork/join framework. 

The Fork/Join Framework 

The fork/join framework 
allows you to divide a large 
task into a number of smaller 
subtasks. Each of these 
smaller tasks is broken down 
even further until, finally, you 
have atomic units of work 
that will all be executed. 


The results of processing 
the atomic units are then 
joined together to produce a 
final result. 

Java SE 7 introduced 
top-level support for the 
fork/join framework with 
ForkJoinPool, ForkJoinTask, 
and ForkJoinWorkerThread. 
Unfortunately, the program¬ 
ming idiom for using the 
fork/join framework is fairly 
complex and, consequently, 


this technique has been 
mostly ignored by everyday 
Java developers banging out 
business applications. 

That all changed with the 
release of Java SE 8 and the 
introduction of streams. 
Underthe hood, parallel¬ 
ized streams are converted 
into fork/join tasks, and that 
makes it easy for developers 
to add parallelism to their 
applications without under- 




Click here to listen to Stephen Chin (right) 
interview Kirk Pepperdine at Jfokus. 
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standing the underlying mecha¬ 
nism. As much as adding paral¬ 
lelism might help performance, 
indiscriminate use of the fork/join 
framework can certainly harm per¬ 
formance. Let's explore this further 
by looking at how streams work. 

Streams 

A stream takes a collection as its 
source and (similarly to an itera¬ 
tor) makes each element available 
to some process. Streams have no 
storage and might or might not 
be bounded. In addition, streams 
support intermediate (aka lazy) and 
terminal (aka eager) operations. 
Intermediate operations always 
return a new stream, whereas ter¬ 
minal operations produce a result. 
All of this happens serially unless 
a parallel stream is asked for, and 
that option is supported by the 
underlying datasource. Listing 1 
shows examples of sequential and 
parallel streams. 

Note: For hyper-threaded CPUs, 
there are two logical cores for every 
physical core. 

In Listing 1, stream() executes 
sequentially over the data¬ 
source, which is an array of gar¬ 
bage collection (GC) log records 
called gcLogRecords. Calling 
parallelStreamQ on the data¬ 
source results in the work being 
executed as a fork/join operation. 
By default, the fork/join frame¬ 


work uses a common thread pool, 
and the default size of that thread 
pool is equal to the number of 
logical cores. 

Response Time, Threads, 
and Queues 

The response time (or latency) of 
a request is the composite of the 
time when the thread is making 
forward progress and the time 
when processing is stalled (known 
as dead time). Typically, dead time 
accounts for a significant portion 
of the response time and, typi¬ 
cally, most of that time is spent in 
a queue. Thus, if we can reduce the 
time spent in queues, we should 
be able to see significant reduction 
in latencies. 

In the case of a fork/join opera¬ 
tion, what we'd like to know is how 
many threads are in the common 
thread pool and how many out¬ 
standing requests are in the work 
queue. It is this information- 
combined with other resource 
utilization metrics—that will help 
us determine whether our sys¬ 
tem could benefit from a larger or 
smaller common thread pool. 

It's easier to talk about queues if 
we first talk a bit about queueing 
theory and, in particular, Little's law. 
Little's law states that the number 
of tasks in a system (L) is equal to 
the average amount of time a task 
spends in the system (lambda) 
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LISTING 1 


public DoubleSummaryStatistics calculateSequentially( 
List<String> gcLogRecords) throws IOException { 
return gcLogRecords.streamQ. 

map(applicationStoppedTimePattern::matcher). 
filter(Matcher::find). 
mapToDouble(matcher -> 
Double.parseDouble(matcher.group(2))). 
summaryStatistics(); 


public DoubleSummaryStatistics calculateParallel( 
List<String> gcLogRecords) throws IOException { 
return gcLogRecords.parallelStream(). 

map(applicationStoppedTimePattern::matcher). 
filter(Matcher::find). 
mapToDouble(matcher -> 
Double.parseDouble(matcher.group(2))). 
summaryStatistics(); 


^3 Download all listings in this issue as text 


multiplied by the average rate of 
entry into the system (N). That is 
expressed as L = lambda * N. 

The powerful thing about Little's 
law is that it allows you to treat the 
entire system as a black box. For 
example, I worked on a system that 
had a requirement for 1,500,000 
transactions per hour. The average 
latency was 300 milliseconds. 

1,500,000 transactions per hour 
equals 417 transactions per sec¬ 
ond, so if we peek into the system, 
the expected number of active 
requests at any point in time would 


be 0.300 * 417 = 125. If our thread 
pool size was limited to 8, that 
would result in 8 requests making 
forward progress and 117 requests 
sitting in a work queue. Assuming 
there is enough hardware to sup¬ 
port more active requests, it's very 
likely that this system would ben¬ 
efit from a larger thread pool. 

Instrumenting Fork/Join 
Operations 

To get the statistics needed for the 
type of analysis performed above, 
we need to add some instrumenta- 
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tion. Aftercareful examination, I 
determined that I needed to instru¬ 
ment only ForkJoinTask. However, 
the monitoring would be attached 

to ForkJoinPool. 

Let's define some test cases that 
will provide a context for the rest 
of this discussion. At jClarity, we've 
spent a lot of time looking at GC 
logs with Censum. I took this as an 
opportunity to look at how lamb¬ 
das might be used to parse a GC 
log. In the following example, we'll 
focus on records for "application 
stopped time" and "application 
concurrent. " Listing 2 shows a rule 
for parsing application stopped 
times from a GC log. 

The code in Listing 2 takes a list 
of GC log entries, filters them by 
those that match the pattern, and 
then calculates a number of sta¬ 
tistics that describe the stopped- 
time behavior. We can execute this 
code by submitting it directly to the 
ForkJoinPool, as shown in Listing 3, 
which shows the explicit execution 
of a fork/join operation. 

The code in Listing 3 creates a 
new ForkJoinTask and submits it to 
the common ForkJoinPool. When 
the task has been completed, a call 
to ForkJoinTask::setCompletion() is 
made, after which the results can 
be retrieved. The parameter to 
that call is a status word that lets 
the caller know whether the task 
completed normally or whether 


an error was trapped. Any call to 
get() prior to setCompletion() being 
called will result in the calling 
thread being blocked. 

Ideally we would like to mea¬ 
sure the time from the first call to 
submit!) to the time when the job 
is complete. This is easily done 
when we explicitly call the fork/ 
join operation. However, this 
technique fails with the code in 
Listing 4 even though the expres¬ 
sion in that listing is transformed 
into something similar to the code 
shown in Listing 3. 

To work around this, the moni¬ 
toring can be placed deeper into 
the call stack. Doing this intro¬ 
duces a small amount of error in 
the timing, but it is well below 
the level of noise that you'll find 
in any application. The code in 
Listing 5 was lifted from OpenJDK 
and instrumented with a call to 
emit a signal when the task starts 
and when the task is complete. 

Defining the 

ForkJoinPoolMonitorMXBean 

Now that we have a signal to work 
with, we need something to ana¬ 
lyze it. My favored technique in Java 
is to introduce an MXBean. A com¬ 
plete discussion about MXBeans 
is beyond the scope of this article. 
However, it's a technique that I 
teach in my performance tun¬ 
ing workshop, and it is covered 
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LISTING 3 / LISTING 4 / LISTING 5 / LISTING 6 / LISTING 7 


public class ApplicationStoppedTimeStatistics { 

public DoubleSummaryStatistics calculateParallelStream( 
List<String> logEntries) throws IOException { 
return logEntries.parallelStream(). 

map(applicationStoppedTimePattern::matcher). 
filter(Matcher::find). 
mapToDouble(matcher -> 
Double.parseDouble(matcher.group(2))). 
summaryStatistics(); 

} 

} 


Download all listings in this issue as text 


in an online tutorial on Oracle 
Technology Network. Let's define 
some requirements and then intro¬ 
duce an interface for the MXBean. 

The ForkJoinPoolMonitorMXBean 
should expose the following: arrival 
and retirement rates, the total 
number of tasks submitted, and 
the average time a task spends in 
the system. If you recall, Little's 
law tells us that the average time 
of a task in the system multiplied 
by the arrival rate equals the aver¬ 
age number of tasks in the system. 
The average queue length com¬ 
bined with the number of worker 
threads and a set of timers can 


be used to calculate the average 
amount of dead time accumulated 
by tasks waiting in the queue. This 
could lead us to adjust the size of 
the thread pool, assuming there is 
enough hardware to support the 
reconfiguration. The definition of 
the ForkJoinPoolMonitorMXBean is 
shown in Listing 6. 

In addition to these methods, 
ForkJoinPoolMonitor implements 
submitTask(ForkJoinTask) and 
retireTask(ForkJoinTask). The imple¬ 
mentation (shown in Listing 7) 
makes use of a ConcurrentHashMap 
to track all tasks currently in 
the system. 
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Figure 1 


There is also code to wire the 
MXBean into the PlatformMBean 

Server. This exposes the MXBean 
to Java Management Extensions 
(JMX) clients such as jConsole or 
VisualVM's JMX plugin. 

The Unit of Work 

The call to concurrentParallelStream() 

in Listing 8 results in the 
repeated execution of two 
Runnable lambdas. The submit() 

method takes the Runnable and 
wraps it in a ForkJoinTask. Inside 
the calculateParallel() is the call 
sequence Files.lines(path) 

.parallel!). This will result in a new 
ForkJoinTask that will, in parallel, 
process all the lines in the GC log 
file. The two tasks are submit¬ 
ted one after another and before 
the call to get(). That means they 
should run in parallel. However, 
the underlying lambda expressions 


in the calls to calculateParallel() will 
disrupt this because they intrinsi¬ 
cally make a call to get(). So while 
there will be some overlap in the 
execution of the calls to the fork/ 
join operation, as the numbers will 
show, the overlap is limited. 

Getting It to Run 

To get all this to work, we need to 
execute our modified versions of 
ForkJoinPool and ForkJoinTask that 
we borrowed from OpenJDK. In 
OpenJDK, the original implemen¬ 
tations are found in rt.jar and that 
file is loaded by the bootstrap class 
loader. Fortunately we don't have to 
tear rt.jar apart and rebuild it with 
our versions of the classes. The Java 
Virtual Machine (JVM) gives us the 
option to prepend classes to the 
boot classpath using the parameter 
-Xbootclasspath/p:<path>, as shown 
in Listing9. 
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LISTING 8 LISTING 9 


public longconcurrentParallelStream(int repeat, 

StringfileName) { 

ForkJoinTask<DoubleSummaryStatistics> applicationTime; 
ForkJoinTask<DoubleSummaryStatistics> 
applicationStoppedTime; 

DoubleSummaryStatistics 
applicationStoppedTimeStatistics = null; 
DoubleSummaryStatistics applicationTimeStatistics = null; 
long timer = System.nanoTimef); 
try { 

Path path = new File(fileName).toPath(); 
for (int i = 0; i < repeat; i++) { 
applicationTime = 

ForkJoinPool.commonPool().submit(() 

-> new ApplicationTimeStatistics() 
.calculateParallel(path)); 
applicationStoppedTime = 
ForkJoinPool.commonPool().submit(() 

-> new ApplicationStoppedTimeStatistics() 
.calculateParallel(path)); 

applicationTimeStatistics = applicationTime.get(); 
applicationStoppedTimeStatistics = 
applicationStoppedTime.getQ; 

} 

timer = System.nanoTimef) - timer; 
System.out.println("\n-Concurrent Parallel—"); 
System.out.println("Concurrent Stats :" 

+ applicationTimeStatistics); 
System.out.println("Stopped Stats :" 

+ applicationStoppedTimeStatistics); 
System.out.println("Combined Time (client) :" 

+ ((double) timer) / lOOOOOO.Od + " ms"); 

} catch (Throwable t) { 
t.printStackTrace(); 

} 

return timer; 

} 


Download all listings in this issue as text 
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Note: Class loading in Java SE 9— 
and, thus, these details—will most 
likely change. 

Discussion of Results 

The two units of work used in this 
example are both big and complex 
enough that they should benefit 
from parallelism. The first unit 
of work reads all of its data from 
memory. The second reads its 
data from disk. Each of the units 
of work is run serially using parallel 
streams, and then each is run again 
using fork/join operations. The 
results of running the two different 
workloads under these different 
conditions can be found in Table 1, 
which shows GC log parsing exclud¬ 
ing I/O operations, and Table 2, 
which shows GC log parsing includ¬ 
ing I/O operations. 

Before we look at the results 
further, I should give a word of 
caution. There is enough noise in 
these experiments that I wouldn't 
put any stock in the fine differ¬ 
ences. The variation between runs 
was on the order of several sec¬ 
onds, which is why the results are 
coarsened to two significant digits. 
That said, the benchmark appears 
to be good enough to answer the 
following question: Can we do 
better by adapting the size of the 
ForkJoinPool? The answer in this 
case is mixed. 

■ Parallelizing brings benefits 


most of the time. The only 
(expected) loss is when paral¬ 
lelization puts extra pressure on 
the I/O subsystem. 

CPUs are about 20 percent idle 
when the lambda expression 
implicitly uses the fork/join 
framework. Explicitly using the 


fork/join framework soaked up 
that 20 percent, which resulted 
in a corresponding reduction in 
response time. 

The reductions in response time 
don't hold when I/O operations 
are part of the workload. This is 
a reasonable result, because I/O 


operations are many orders of 
magnitude slower than memory 
operations. 

These observations call for 
more experiments. Let's mix the 
CPU- and I/O-bound units of work 
and increase the level of concur¬ 
rency. We can control the number 


TASKS TIME IN FORKJOINPOOL INTER-REQUEST EXPECTED NUMBER OF TOTAL RUNTIME 

SUBMITTED (SECONDS) INTERVAL (SECONDS) TASKS IN FORKJOINPOOL (SECONDS) 

LAMBDA PARALLEL 

20 

2.5 

2.5 

1 

50 

LAMBDA SERIAL 

0 

6.1 

0 

0 

123 

SEQUENTIAL PARALLEL 

20 

1.9 

1.9 

1 

38 

CONCURRENT PARALLEL 

20 

3.2 

1.9 

1.7 

39 

CONCURRENT FLOOD 
(FORK/JOIN) 

20 

6.0 

1.9 

3.2 

38 

CONCURRENT FLOOD 
(STREAM) 

0 

2.1 

0 

0 

41 

Table 1 

TASKS TIME IN FORKJOINPOOL INTER-REQUEST EXPECTED NUMBER OF TOTAL RUNTIME 

SUBMITTED (SECONDS) INTERVAL (SECONDS) TASKS IN FORKJOINPOOL (SECONDS) 

LAMBDA PARALLEL 

20 

2.8 

2.8 

1 

56 

LAMBDA SERIAL 

0 

7.5 

0 

0 

150 

SEQUENTIAL PARALLEL 

20 

2.6 

2.6 

1 

52 

CONCURRENT PARALLEL 

20 

5.8 

3.0 

1.9 

60 

CONCURRENT FLOOD 
(FORK/JOIN) 

20 

43 

6.5 

6.6 

130 

CONCURRENT FLOOD 
(STREAM) 

0 

3.0 

0 

0 

61 


Table 2 

ORACLE.COM/JAVAMAGAZINE ////////////////////////////////////////// MARCH/APRIL 2015 


o 

o 


o 

< 


I 


~3 



CO 


I— 

ZD 

O 

CD 

< 













































//java architect / 



TASKS TIME IN FORKJOINPOOL INTER-REQUEST EXPECTED NUMBER OF TOTAL RUNTIME 

SUBMITTED (SECONDS) INTERVAL (SECONDS) TASKS IN FORKJOINPOOL (SECONDS) 


16 FORK/JOIN THREADS 

20 

210.6 

15.2 

13.83 

304.5 

8 FORK/JOIN THREADS 

20 

27.2 

4.1 

6.65 

81.9 

4 FORK/JOIN THREADS 

20 

14.0 

3.60 

3.88 

72.0 


Table 3 


of threads by setting the property 
-Djava.util.concurrentForkJoinPool 

.common.parallelism=N. Table 3 

shows the results of the mixed 
workload with 4, 8, and 16 
instances of ForkJoinWorkerThreads. 

The results in Table 3 confirmed 
my suspicion that adding more 
threads would not have any sig¬ 
nificant benefit. The CPU utiliza¬ 
tion chart in Figure 1 offers some 
explanation of why this might be 
the case. 

The two tasks were submit¬ 
ted such that subtasks from both 
workloads should have intermixed. 
However, there is a telling sign 
in Figure 1: the big red 
blobs due to overloading 
the kernel with disk I/O 
operations. The first run 
with 16 threads is domi¬ 
nated by kernel activity. 

In contrast, the subse¬ 
quent runs with 8 and 
4 worker threads show 
much less kernel activity. 

I should remind you that 
the exact same work¬ 


load was offered up in all three 
cases. The only difference is the 
number of threads in the common 
thread pool. 

Conclusion 

What these experiments indi¬ 
cate is that there doesn't seem to 
be much benefit in adding more 
threads. In fact, these results show 
that adding more threads and, 
hence, adding more pressure on 
an already stressed I/O system, 
can be exceptionally detrimental 
to performance. 

"Measure, don't guess" tells us 
that when we tune, we always want 
to make informed deci¬ 
sions driven by solid 
measurements. To get 
the needed measure¬ 
ments, we often need 
to improve the observ¬ 
ability of the application 
being investigated. In 
this case, we were better 
able to understand how 
the fork/join framework 
performed by adding 


ADDING THREADS 


Adding threads 

on an already 
stressed I/O 
system can be 

detrimental to 
performance. 




instrumentation. To simplify the 
task of adding instrumentation, the 
source code for ForkJoinPool was 
directly modified. A better way to 
instrument would be to use byte¬ 
code manipulation (through the 
agent framework). 

A project is being incubated at 
Adopt OpenJDK on GitHub and it 
is hoped that it will be included 
in Java as a formal patch. I invite 
you to add other measures—such 
as steal counts (already estimated 
by the fork/join framework) or the 
number of active threads—or any 
other measure you think would 
help improve the visibility of the 
fork/join framework. </article> 
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See the effects of small source code changes and Java HotSpot VM switches. 


O racle's Java HotSpot VM 
contains dynamic com¬ 
pilers. These compilers gather 
statistics from your running 
program that are used to 
optimize performance by 
selectively replacing inter¬ 
preted bytecode with faster 
native code using just-in-time 
(JIT) compilation techniques. 

JITWatch is a free, open 
source tool that analyzes the 
complex compilation log file 
output generated by Java 
HotSpot VM and helps you 
visualize and understand 
the optimization decisions it 
made. Chris Newland devel¬ 
oped JITWatch as part of the 
Adopt OpenJDK project, and 
it is available for download 
from GitHub. Follow Chris 
on Twitter @chriswhocodes 
for updates. 

In Part 2 of this three- 
part series, we explored 
howto make Java HotSpot 


VM produce the informa¬ 
tion JITWatch needs. In this 
article, we walk through some 
of the more advanced Java 
HotSpot VM features and 
explore how you can use the 
JITWatch Sandbox to test the 
effects of code changes and 
configuration settings on Java 
HotSpot VM behavior. 

JITWatch Refresher 

You can download the 
JITWatch binary from GitHub, 
oryou can build it from 
source code using the follow¬ 
ing command: 

I mvn clean install 

Then, you can launch 
JITWatch using ./launchUI.sh 
(for Linux or Mac OS) or 
VlaunchUI.bat (for Microsoft 
Windows). Full instruc¬ 
tions for getting started are 
detailed in the wiki. 


As we look at more- 
advanced Java HotSpot VM 
features, you will find it use¬ 
ful to have the disassembly 
binary (called hsdis) installed 
in your Java runtime environ¬ 
ment (JRE) so that you can 
view the disas¬ 
sembled native 
code the JIT com¬ 
pilers produce. 

Instructions for 
building hsdis can 
be found here. 

Enter the 
Sandbox 

The Sandbox is a 
JITWatch feature 
that lets you edit 
code and then 
compile, execute, 
and analyze the 
Java HotSpot 
VM JIT logs, all 
from within the 
JITWatch appli¬ 


cation. It's intended to help 
developers understand what 
goes on "under the hood" 
of Java HotSpot VM when a 
program is run and to see the 
effects of small source code 
changes and Java HotSpot 
VM switches. 

Note that iso¬ 
lated testing of 
algorithms inside 
the Sandbox 
might not result 
in the same Java 
HotSpot VM 
compiler deci¬ 
sions that would 
be made when 
running the full 
application. That's 
because Java 
HotSpot VM will 
have less profil¬ 
ing information 
on which to base 
its optimization 
decisions. 


ANALYZE THIS 


JITWatch is a free, 
open source 
tool that analyzes 
the complex 


output generated 
by Java HotSpot 
VM and helps 
you visualize 
and understand 
the optimization 
decisions it made. 
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Figure 1 


Once you've launched JITWatch, 
click the Sandbox button in the top 
left of the main window to open 
a Sandbox window like the one 
shown in Figure 1. 

As well as supporting Java 
source code, the Sandbox has sup¬ 
port for Scala and Groovy, because 
these languages compile to byte¬ 
code that executes on the Java 
Virtual Machine (JVM). Other JVM 
languages—such as Kotlin, Clojure, 
JRuby, and JavaScript (using 
Nashorn)—will be supported in 
the future. 

Behind the scenes, JITWatch 
uses a java.Iang.ProcessBuilder to 
compile and execute your code. 

Experimenting with Java 
HotSpot VM Switches 

Let's begin by looking at which Java 
HotSpot VM switches can be used 
to control JIT compilation. 


Click the Configure Sandbox 

button to open the Sandbox 
Configuration window shown 
in Figure 2. You can set up each 
VM language you wish to use in 
the Sandbox by specifying its 
home directory. 

Next, select Show Disassembly 
to instruct the Java HotSpot VM to 
produce human-readable assembly 
language from the JIT-compiled 
native code. This capability requires 
the hsdis binary. 

You can override the default 
setting for the Tiered Compilation 
option, which controls when code 
is optimized. Optimization is done 
first quickly with the Cl client com¬ 
piler and then sometimes again 
with the more advanced C2 server 
compiler after more runtime sta¬ 
tistics have been gathered. The 
Tiered Compilation option is dis¬ 
abled by default in Java SE 7 and 
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>- 



Figure 2 


enabled in Java SE 8. 

Java HotSpot VM manages a 
pointer to an object, called an 
oop (ordinary object pointer), 
that is usually the same size as 
the native machine pointer. So 
on a 64-bit system, the required 
heap space will be larger than 
on a 32-bit system. Java HotSpot 
VM is able to save heap space by 
representing 64-bit pointers as 
32-bit offsets from a 64-bit base; 


this is known as compressed oops. 
Disabling the Compressed Oops 
option simplifies the inspec¬ 
tion of the disassembled native 
code, because it eliminates the 
pointer arithmetic code needed 
to support compressed oops. 
Make sure you understand the 
implications before altering this 
setting in a production environ¬ 
ment. For more information, see 
"Compressed Oops". 
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The FreqlnlineSize and 
MaxInlineSize fields control the 
bytecode size thresholds up to 
which Java HotSpot VM will inline 
small and hot methods, respec¬ 
tively. You can also choose to dis¬ 
able inlining completely. (You might 
find it interesting to measure how 
much performance is lost when 
you do this.) 

Finally, the Compile Threshold 

field sets the number of method 
invocations required before the 
C2 server compiler is invoked, and 
the Extra VM switches field allows 
you to pass any additional tuning 
parameters. 

Hovering over an option will dis¬ 
play a tool tip that shows the exact 
switch passed to Java HotSpot VM. 

JIT Method Inlining 

We'll now use the Sandbox to 
learn about how Java HotSpot VM 
employs a JIT technique called 
method inlining, which is when the 
call to a target method is replaced 
with the method body itself to 
eliminate the invocation overhead. 
This early step is important for 
the JIT compiler, because it brings 
related code closer together before 
further optimizations (such as loop 
unrolling) are performed. 

Java HotSpot VM can inline code 
when the method is called on an 
interface type and the concrete 
types are not known until runtime. 


It does this by counting the number 
of implementations that are used 
during program execution: 

■ When the call is observed as 
being monomorphic (a single 
implementation is seen), the 
code can be simply inlined. 

■ When the call is bimorphic (two 
implementations are observed), 
Java HotSpot VM can inline 
both implementations and use 
a simple condition to choose 
between them. 

■ When the observed behavior is 
megamorphic (more than two 
implementations), lookup tables 
are required and Java HotSpot 
VM cannot inline the method 
implementations. 

Java HotSpot VM adds an 
uncommon trap to the opti¬ 
mized code in case the number 
of observed implementations 
increases and it has to reconsider 
the decisions it made. 

Let's use the example of an inter¬ 
face called Coin that declares a 
single method depositQ and three 
implementations of the interface 
named Nickel, Dime, and Quarter. 
You can observe the different 
behaviors when two implemen¬ 
tations (bimorphic dispatch) and 
three implementations (megamor¬ 
phic dispatch) are seen at runtime. 

In the Sandbox, open the source 
file PolymorphismTest.java, which is 
included in the JITWatch examples. 
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PolymorphismTest.java X 


1 

2 

3 

4 

5 

6 
7 
d 
9 

10 

11 

12 

13 

14 

15 

16 
17 
13 

19 

20 
21 
22 

23 

24 

25 

26 
27 

2a 

29 

30 

31 

32 

33 

34 

35 

36 

37 
3a 

39 

40 

41 

42 


public class PolymorphismTest 
{ 

public interface Cain ( void deposit(); I 
public static int moneyBox * 0; 

public class Nickel implements Cain { public void deposit{) ( moneyBox +— 5; } \ 

public class Dime implements Coin ( public void deposit!) { moneyBox +- 10; } } 

public class Quarter implements Coin ( public void deposit!) { moneyBox +— 25; \ } 

public PolymorphismTest!) { 

Coin nickel - new Nickel!); 

Coin dime » new Dime(); 

Coin quarter - new Quarter!); 

Coin coin * null; 

// change the variable maxlmplementations to control the inlining behaviour 
// 2 - bimorphic dispatch - the method call will be inlined 

// 3 — megamorphic dispatch - the method call will not be inlined 

final int maxlmplementations - 2; 

for (int i * 0; i < 100000; i++) ( 

switch(i % maxlmplementations) { 
case 0: coin “ nickel; break; 
case 1: coin — dime; break; 
case 2: coin - quarter; break; 

coin.deposit(); 

System.out.println("moneyBox: M + moneyBox); 


public static void main{String [ ] args) ( 
new PolymorphismTest(); 


Figure 3 


See the listing shown in Figure 3. 
On line 24, you can control the 
number of implementations that 
are used by setting the variable 
maxlmplementations. 

Click the Run button in the 
Sandbox window to execute the 
program. The JITWatch TriView 


window, shown in Figure 4, opens 
as soon as the Java HotSpot VM log 
file has been analyzed. 

When two of the Coin implemen¬ 
tations are used, hovering over the 
invokeinterface instruction at offset 
93 in the Bytecode pane shows a 
tool tip confirming that the method 
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%/ Source ✓ Bytecode ✓ Assembly Compile Chain JIT Journal 


Bytecode size 

Native size 

Compile time (ms) 

132 


792 


8 


Source 


public static int moneyBcx - 0; 

public class Nickel implements Coin ( public void depositO ( moneyBox f- S; ( ) 

public class Dime implements Coin { public void deposit() { moneyBox +- 10; \ \ 


10 



11 

public 

class Quarter implements Coin ( pubi 

12 



13 

public 

PolymorphismTest() ( 

14 

Coin 

nickel - new Nickel (); 

] 15 

Coin 

dime - new Dime(); 

16 

Coin 

quarter - new Quarter(); 

17 



13 

Coin 

coin - null; 

19 



20 

// change the variable maxlmplementations 

21 

// 2 

- bimorphic dispatch - the method 

:22 

// 3 

- megamcrphic dispatch - the method 

523 



’24 

final int maxlmplementations - 2; 

|25 



52 6 

for 

(int i - 0; i < 100000; i-»-+) { 

!27 

switch (i S. maxlmplementations) ( 

i23 


case 0: coin - nickel; break; 

129 


case Is coin - dime; break; 

|30 


case 2: coin - quarter; break; 

31 

} 


|32 




:34 
35 
;3 6 
37 
!3s 
30 
•40 
41 

•42 ) 


System.cut*println("moneyBox:" + moneyBox), 


\ 


public static void main(String f] args} { 
new PolymorphismTest (); 

\ 


Bytecode (double dick for JVM spec) 


35: 

istore 

5 

37: 

iconst_0 


33: 

istore 

6 

40: 

iioad 

6 

42: 

ldc 

43 // 

44: 

if icmpge 

104 

47: 

iioad 

6 

49: 

iconst_2 


50: 

irem 


51: 

tableswitch ( 

// 0 to 2 


if int 100 


76: 

77: 

70: 

32: 

33: 

35: 

33: 

30: 

01 : 


Q : 76 
1:32 
2 : G8 
default:91 

} 

aload_l 

astore 

goto 

aload_2 

astore 

goto 

aload_3 

astore 

aload 


4 

01 

4 

01 

4 

4 


03: 

101 : 

104: 

107: 

110 : 

111 : 


Inlined: Yes, inline (hot] 

Count: 11264 

iicount* 7232 

Bytes: 9 

Prof factor: 1 


114: 

116: 

I Ctrl -click to 
| Backspace to r 

inspect 

eturn 

this 

method 

119: 

getstatic 

41'5 

77 

Field :r 

122: 

invokevirtual 

416 

if 

Method 

125: 

invokevirtual 

417 

// 

Method 

123: 

invokevirtual 

418 

// 

Method 

|131: 

return 





Figure 4 

was successfully inlined, as shown 

in Figure 4. 

In the Assembly pane, you 
can see the native code inlined 
from the deposit!) method on 
inner class PolymorphismSNickel. 
This adds 5 to the static vari¬ 


Assembly 

0x000000010249c632: 

mov 

Srlld,0x98( 

0x000000010249c639: 

jmpq 0x00000001 

0x000000010249c63e: 

cmp 

Srsi, SrlO 

0x000000010249c691: 

jne 

0x000000010 

0x0 00000 01024 9c693 : 

add 

$0x5,^rlld| 

0x000000010249c697: 

mov 

Srlld,0x98{ 

0x000000010249c69e: 

jmp 

0x000000010 

0x000000010249c6aQ: 

mov 

$0xff£ffff6 

0x000000010249c6a5 : 

xchg iax,Sax 

0x000000Q10249c6a7: 

call 

.q 0x0000000 

0x000000010249c6ac: 

callq 0x0000000 

0x000000010249c6bl: 

mov 

Srl3d f SrlOd 

0x000000010249c6b4: 

mov 

SrlQd,%rl3d 

0x000000010249c6b7: 

mov 

$0xffffffc6 

0x0 0000001024 9c6bc: 

mov 

ircx,Srbp 

0x000000010249c6bf: 

mov 

Srdx, (Srsp) 

0x000000010249c€c3: 

mov 

Srl3d,0x10{ 

0x000000010249c6c8: 

mov 

Srbx,0x13(% 

0x000000010249c6 cq: 

mov 

Sr 14,0x20(V 

0x000000010249c6d2: 

nop 


0x000000010249c6d3: 

call 

.q 0x00000 00 

0x000000010249c6d3: 

call 

.q 0x00000 00: 

0x00000001024 9c6da: 

add 

$0xa,Srlld 


able moneyBox on source line 7 
using the assembly instruction 
at 0x000000010249c693: add 
S0x5,%rlld. 

Lower down in the Assembly 
pane, you can see the correspond¬ 
ing inlined code for adding 10 to 


the static variable moneyBox from 
PolymorphismSDime on source 
line 9 using the assembly instruc¬ 
tion at 0x0000000I0249c6dd: add 

S0xa,%rlld. 

■ » 

To see the exact set of optimi¬ 
zations performed, click the JIT 


Journal button in the top left of the 
TriView screen. The uncommon 
trap for the bimorphic inlining is 
indicated in Figure 5 with a red dot. 

When all three Coin implemen¬ 
tations are involved, the method 
call to deposit!) can't be inlined 


>- 


o 

o 


o 

< 


I 

3 


CO 


o 

CD 

< 


f 

* 


lava 

.net 


blog 



36 


ORACLE.COM/JAVAMAGAZINE / ///////////////////////////////////////// MARCH/APRIL 2015 
































































//java architect / 


<method bytes- 0 9° name- 0 deposit 0 £lags-°l° holder« w 782" id-°793° iicount-°7282 
<call method-"793" inline-°l° count- 0 11264" pro£_£actor-°l°/> 

<inline_success reason=°inline (hot)°/> 

<predicted_call klass= , *781 0 bci="93V> 

<unccmmon_trap reason="null_check 0 bci-°93" acticn-°maybe_recompile"/> 
<predicted_call klass="782° bci="93°/> 

<unccmmon_trap reason-"bimorphic 1 ' bci»°93° action—°maybe_recompile°/> 

Sparse method- 0 793" stamp-"0 * 704" uses-°11264 °> <!— void PolymcrphismTest$Nic 
<parse_done nodes-"264° memcry-°6 61 B4° stamp-"0.704° live—°258"/> 

</parse> 

Sparse method- 0 792" stamp-"0 * 704" uses—°112€4°> <!— void PolymorphismTest$Dim 
<parse_done nodes—"281° memory- 0 53660° stamps"0.704° live—°274°/> 

</parse> 

<bc code-°178" bci-°104°/> 

•iT • irn nm mn r, 4- v 3 m. v-g a enn ‘hin 1 . -» ■ 3 ^ inri 1 * Tvg i ** 1 fi L ** j nf i r-. i - . ** v g i r . f ayr^v o f 11 i nr iov ** C. Pi ** / S - 


Figure 5 


and there is no inlining tool tip at 
bytecode offset 93, as shown in 

Figure 6. 

In addition, you won't find the 
inlined assembly code for deposit!), 
as was seen in Figure 4, and the 
virtual_call for the invokeinterface 

bytecode hasn't been eliminated. 
Also note that the native code size 
of 504 bytes is smaller than with 


bimorphic dispatch, because no 
inlining has occurred to increase 
the size of the native code. 

That's as deep as we want to 
dive this time, but if you'd like to 
know more, I recommend running 
the examples that come with the 
Sandbox and examining the opti¬ 
mizations that Java HotSpot VM 
can make. 
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Bytecode size 


Native size 


Compile time (ms) 


132 
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public interface Coin ( void deposit() 

public static int mcneyBox - 0; 

public class Nickel implements Coin ( 

public class Dime implements Coin { pu! 

public class Quarter implements Coin { 

public PolymorphismTest() { 

Coin nickel ^ new Nickel(); 

Coin dime - new Dime(>; 

Coin quarter - new Quarter{); 

Coin coin « null; 

// change the variable maxImplementa 
// 2 * bimorphic dispatch - che me 
// 3 - megamorphic dispatch - the me 

final int maxImplementations =• 3; 

for (int i * 0; i < 100000; i++) ( 

switch (i % maxlmplementations) { 
case 0: coin - nickel; break; 
case 1: coin «• dime; break; 
case 2: coin - quarter; break; 

> 


34 

35 

36 


System, out .println ("mcneyBox: ° + mon 


Bytecode (double click for JVM spec) 


Assembly 


jrrr 

mVo.«!k sp££ la 

i - rr- 

f t 

«er?rea "raiymcrp 

30: 

astore_3 




31: 

aconst_null 




32: 

astore 

4 



34: 

iconst_3 




35: 

istore 

5 



37: 

iconst_0 




39: 

istore 

6 



40: 

iload 

6 



42: 

ldc 

#8 

// 

int 100000 

44: 

i£_icmpge 

104 



47: 

iload 

6 



49: 

iconst_3 




50: 

irem 




51: 

tableswitch 

0:76 

1:82 

2:88 

default:91 

} 

aload_l 

( // 0 

to 2 


76: 




77: 

astore 

4 



79: 

goto 

91 



82: 

aload_2 




83: 

astore 

4 



85: 

goto 

91 



88: 

aload_3 




89: 

astore 

4 



91: 

■■■ 

aload 

4 





BBBH 







101:" 

Ctrl-cltck to 

inspect 

this 

me thod 11 i 

104: 

Backspace to 

return 


■ . .. 

107: 

new 

TTT 

—rr 

class' 1 j ava/lang 

110; 

H’in 





0x0G00000109aecd2f: 

test Sr9,%r9 


0x0000000109aecd32: 

je 0x0000000109aece23 


0x0000000109aecd38: 

mov 

0x8(%r9), %rl0 


0x0000000109aecd3c: 

movabs $0x2592c5db8,^rdi ; (metadata('PolymorphismTest! 

0x0000000l09aecd46: 

cmp 

%rai,%rl0 


0x0000000109aecd49: 

jne 

0x0000000109aece35 ;*iload 




; - PolymorphismTest::<init>@40 (] 

0x0000Q00109aecd4f: 

jmp 

0x0000000109aecd93 


0x0000000109aecd51: 

mov 

%rcx,0x18(%rsp) ;*aloaa 




; - PolymorphismTest::<init>@91 

(lint 

0x0000000109aecd56: 

mov 

%rcx,0x10(^rsp) 


0x0000000109aecd5b: 

mov 

%rll, 0x8(Irsp) 


0x0000000109aecd60: 

mov 

%r9, (%rsp) 


0x0000000109aecd64: 

mov 

fcebx,%ebp ;*tableswitch 




; - PolymorphismTest::<init>051 (line 

27) 

0x0000000109aecd66: 

mov 

0x18(^rsp),Srsi 


0x0000000109aecd6b: 

xchg iax,iax 


0x0000000109aecd6d: 

movabs $0xff££ff££f£f£ff££ r %rax 


Gx0000000109aecd77: 

callq 0x0000000109abaf60 ; CopMap{[0]-Ocp [8)-0op (16]-Oc 



;*invokeinterface deposit 




; {virtual_call| 


0x0000000109aecd7c: 

inc 

%ebp ;*iinc 




; - PolymorphismTest::<init>@98 (line 26) 


0x0000000109aecd^e: 

mov 

0x18(Srsp),%rl3 


0x0000000109aecd93: 

mov 

4ebp,%ebx 


0x0000000109aecd85: 

mcv 

(%rsp),%r9 


0x0000000109aecd39: 

mov 

0x9(%rsp) , %rll 


0x0000000109aecd8e: 

mov 

0x10 (Sr3p),Srcx ;*iload 




; - PolymorphismTest::<init>@40 

(lint 

0x0000000109aecd93: 

cmp 

$0xl86a0 f ^ebx 


0x0000000109aecd99: 

jge 

0x0000000109aece00 ;*if_icmpge 




; - PolymorphismTest::<init>@44 (! 

0x0000000109aecd9b: 

movslq iebx,%r8 



GO 


O 

CD 

< 
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JarScan Tool 

The JITWatch download includes a 
tool called JarScan that will scan a 
list of JAR files and count the byte¬ 
code size of every method and con¬ 
structor it finds. The purpose of this 
tool is to highlight methods that 
have more bytes of bytecode than 
the Java HotSpot VM threshold for 
inlining "hot" methods, so you can 
get hints on where to benchmark 
to see whether decomposing code 
into smaller units can increase the 
program's performance. 

The JIT compiler determines the 
"hotness" of a method using a set 
of heuristics that include the meth¬ 
od's call frequency and the number 
of times back branches (such as on 
a loop) are taken. 

There are limits on what can be 
inlined, including a limit on the 
bytecode size of the called method. 
This behavior can be tuned using 
the Java HotSpot VM switch 
-XX:FreqInlineSize=n (the default 
"hot" bytecode size threshold is 
325 bytes on 64-bit Linux). 

More details on the JarScan tool 
can be found in the wiki. 

Just because a method's byte¬ 
code size exceeds FreqlnlineSize 
doesn't guarantee that it's a per¬ 
formance bottleneck. The code 
might never be invoked enough 
times to reach the hot code limit. 
(Remember that JarScan is a static 
analysis tool and has no knowl¬ 


edge of how methods are used.) 

We will now explore using the 
JarScan tool to examine the rt.jar 
file, which contains the core Java 
libraries (such as java.lang.* and 
java.util.*) from the latest version 
of Java. 

Run the command shown in 
Listing 1. This command might take 
a couple of minutes to complete, 
because the tool disassembles 
each method in the JAR file's class 
files using the API form ofthejavap 
program called com.sun.tools.javap 
.JavapTask. 

In Java SE Development Kit 8, 
Update 31 (JDK 8u31) on 64-bit 
Linux, 3,605 methods exceed 
the FreqlnlineSize threshold, 
and many of these aren't a sur¬ 
prise. For example, java.util 
.GregorianCalendar.computeFieldsO 
has a lot of work to do that justifies 
its 1,571 bytes of bytecode. Some 
of the results, however, might raise 
an eyebrow because they include 
core library methods—such as 
java.lang.String.toUpperCase() and 
java.lang.String.toLowerCase()— that 
you might expect to find inside 
tight loops. These both weigh in at 
439 bytes of bytecode. 

The Javadoc explains that this 
is due to the requirement to sup¬ 
port character sets in all lan¬ 
guages: "Since case mappings are 
not always 1:1 char mappings, the 
resulting String may be a different 
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./jarScan.sh /home/chris/jdkl.8.0_31/jre/lib/rt.jar > methods.csv 


Q Download all listings in this issue as text 
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length than the original." So they 
contain code to check and resize 
the underlying character arrays, 
if needed. 

This spawned the idea that if 
your application domain doesn't 
require the flexibility the core 


libraries offer and performance 
is critical, creating your own less- 
flexible version of a core library 
method can result in significant 
performance improvements. 

As a test, a method was writ¬ 
ten that converts String objects 


to uppercase. It works only on 
the ASCII characters a through z, 
and its performance is measured 
against java.lang.String.toUpper 
Case using the OpenJDKJMH 
benchmarking framewor k. Run this 
benchmark on the command line, 


Gass: 


org.adoptopenjdk.jitwatch. benchmarks.Uppercase 


>/ Source >/ Bytecode ✓ Assembly Compile Chain 

JIT Journal 

Bytecode size 

Native size 

Compile time (ms) 


69 

1080 


11 


Source 
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2 
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9 

10 

11 
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13 
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13 

19 

20 
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33 
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40 


package org .adoptopero dk. jitwatch.benchmarks; 

import java.util.concurrent.TimeUnit? 

import org.openjdk.jmh.annotations.Benchmark; 

import org.openjdk.jmh.annotations.BenchmarkMcde; 

import org.openjdk.jmh.annotations.Mode; 

import org.openjdk.jmh.annotations.OutputTimeUnit; 

import org.openjdk.jmh.annotations.Scope; 

import org.openjdk.jmh.annotations.State; 

import org.openj dk.jmh.infra.Blackhole; 

Estate{Scope.Thread) 

BOutputTimeUnit(TimeUnit.SECONDS) 

8BenchmarkMcde(Mode.Throughput) 
public class Uppercase 
{ 

private static final String SOURCE * "Lcrem ipsum dolor sit arnet, 
+ “Ea eius porro signiferumque ius, illud eligendi vim ea." 

+ “Has ad euismoa evertitur pertinacia, at labore cmittam duo 
+ “Earn dicat eligendi pericula in. Vel cibo probatus in.“ 

+ "Mei ea impetus persius, et facer laudem minimum mel.°; 

@Benchmark 

public String convertstring(Blackhole bh) { 
return SOURCE.toUpperCase(); 


@Benchmark 

public String convertCustom(Blackhole bh) { 
return toUpperCaseASCII(SOURCE); 


public final String toUpperCaseASCII(String source) { 
int len - source.length () ; 

char(] result — new charflen]; 

for (int i - 0; i < len; i-M-) ( 


Bytecode (double click for JVM spec) 


Q: 

aload_l 



1: 

invokevirtual 

#6 

// Method java/lang/String.length:OI 

4: 

istore_2 



5: 

iload_2 



6: 

newarray 

char 


3: 

aatore_3 



9: 

iconst_0 



10: 

iatore 

4 


12: 

iload 

4 


14: 

iload_2 



15: 

if_icmpge 

€0 


13: 

aload_l 



19: 

iload 

4 


24: 

iatore 




26: 

iload 

5 

::_ 

23: 

bipush 

97 

C-jur.t: 6 614 

B v-fes : 2 9 if |HH| 

30: 

if icmplt 

48 

33: 

iload 

5 


35: 

bipush 

122 


37: 

if icmpgt 

48 

I v / ’^>4, 

40: 




iload 

5 

42: 

hipush 

32 


44: 

isub 



45: 

i2c 



46: 

iatore 

5 


43: 

aload_3 



49: 

iload 

4 


51: 

iload 

5 


53: 

castore 



54: 

line 

4, 1 


57: 

goto 

12 


€0: 

new 

#3 

// class java/lang/String 

63: 

dup 



64: 

aload_3 



65: 

invokeapecial 

#9 

// Method j ava/lang/String. I, <init>”[C)V 

63: 

areturn 




rather than from the Sandbox, and 
load the resulting Java HotSpot VM 
log file into JITWatch for analysis. 

The code for this benchmark is 
shown in Listings 2a and 2b. 

It turns out there is a heavy price 
to pay for supporting all character 
sets. The custom version achieved 
more than 4.5 times as many oper¬ 
ations per second as the core library 
version, as shown in Listing 3. 

Analyzing the log file from 
JMH shows that the method 
toUpperCaseASCII generates 
69 bytes of bytecode and every 
method call it makes was success¬ 
fully inlined (highlighted in green, 
as shown in Figure 7). 

Conclusion 

That wraps up this series on some 
of the optimizations that Java 
HotSpot VM can perform. Check 
out the "Performance Techniques" 
section of the wiki, and play with 
the examples that come with the 
JITWatch Sandbox. 

If you have any questions or a bug 
to report, contact Chris through the 
GitHub project. </article> 
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The Quantum Physics of Java 

An introduction to modern chip design and its effect on Java programs 
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I magine you have an array 
with 67,000 integer ele¬ 
ments and you run two loops 
over the array, as shown in 
Listing 1. Both loops multiply 
the elements of the array by 
three. However, while the first 
loop changes every element, 
the second loop modifies only 
every sixteenth element. 

How much faster will the 
second loop be compared to 
the first? Take a guess! 

The surprising answer is 
that if the code is executed on 
a typical laptop, both loops 
take the same amount of 
time. Table 1 shows measure¬ 
ments from three computers. 
The difference is negligible. 
The second loop does only 
a fraction of the work, so 
how is it possible that the 
first loop runs as fast as 
the second? 

To understand this 
behavior you have to con¬ 
sider how the CPU and the 
memory system work. On 
the lowest level, modern 


computers can show surpris¬ 
ing behavior, very much like 
quantum mechanics, that 
seems to contradict daily 
experience. But sometimes 
quantum mechanics has 
noticeable effects on our "real 
world." And sometimes the 
effects of processes at the 
hardware level have a notice¬ 
able effect on our programs. 
This article takes a look at 
how modern computers 
work at the lowest level and 
explores the things that can 
affect performance. 

Instructions 

If you try to imagine how the 
loops in Listing 1 are exe¬ 
cuted, your initial interpreta¬ 


tion might be that the array is 
stored in main memory and 
the CPU reads element after 
element, multiplies each ele¬ 
ment by three, and writes the 
result back, as you can see in 
Figure 1. This interpretation is 
useful for understanding the 
functionality of the loops, but 
it's not what really happens 
inside a computer. 

Figure 2 shows a graph 
of relative performance 
improvements that CPUs 
and memory went through in 
recent decades. Memory per¬ 
formance improved steadily 
during the whole period, but 
that was nothing compared 
to the improvements in CPU 
speed, especially during the 


SMALL STEPS LARGE STEPS EVERY 

EACH ELEMENT 16TH ELEMENT 

I7-4980HQ @ 2.8 GHZ, MAC OS X YOSEMITE 

30.4 MS 

29.7 MS 

17-3770 @3.4 GHZ, LINUX MINT 14 

25.8 MS 

26.1 MS 

T7200 @ 2 GHZ, LINUX MINT 14 

193.0 MS 

184.2 MS 


Table 1 


1990s. In recent years, plain 
CPU speed hit a limit, but do 
not be fooled! The scale in 
Figure 2 is logarithmic. Even 
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Figure 3 


though it might look as if memory 
performance is catching up, the gap 
is still huge. 

For our example, this means that 
if a computer worked as we imag¬ 
ined in Figure 1, it would be terribly 
slow. The CPU would wait most of 
the time for the memory to deliver 
the next element. To overcome this 
bottleneck, processor designers 
added a cache between the CPU 
and main memory. The cache is a 


smaller and much faster memory 
module, whose whole purpose is 
to mitigate the performance gap. 
Figure 3 shows an improved model 
of the CPU and memory system. 

Programs tend to access the 
same data and code several times 
within a short period of time ( tem¬ 
poral locality), and memory access 
is often limited to small regions 
(.spatial locality). This means that 
if you load all the data that you 
use into the cache, there is a high 
chance that you'll need it again 
later. And because the next time 
you need it the data is available in 
the cache, the performance of your 
programs increases tremendously. 

Now you might wonder—if we 
can put faster memory between 
CPU and main memory, why can't 
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LISTING 1 


private static final int ARRAY_SIZE = 64 * 1024 * 1024; 
public int[] array = newint[ARRAY_SIZE]; 


for (inti = 0, n = array.length; i < n; i++) { 
arrayfi] *= 3; 

} 


for (inti = 0, n = array.length; i < n; i+=16) { 
array[i] *= 3; 

} 


Q Download all listings in this issue as text 


we just make the whole memory 
faster? There are mostly two rea¬ 
sons. For one, main memory is a 
lot larger than cache, and it sim¬ 
ply takes more time to find the 
right address within 16 GB (the 
typical size of main memory, as of 
this writing) than to find the right 
address within 8 KB (the typical 
size of Level 1 [LI] cache). However, 
probably the more important rea¬ 
son is that the electronic compo¬ 
nents of cache (SRAM) are much 
more expensive than the ones used 
in main memory (DRAM) in terms 
of heat and space. Heat and space 
are the limiting factors in modern 
chip design. 

To exploit spatial locality, the 
cache doesn't work with individual 
bytes but uses cache lines instead. 


A cache line is an adjacent part of 
the memory, typically 64 bytes. 

What really happens when you 
iterate over the large loops from 
Listing 1 can be seen in Figure 4. 
The CPU loads a complete cache 
line from main memory into the 
cache and modifies the elements 
in the cache directly. The first loop 
modifies all elements in the cache 
line, while the second loop modifies 
only one element (16 integers, each 
4 bytes long). The limiting factor in 
this setup is loading the cache line 
into the cache; it almost doesn't 
matter how many operations you 
execute on each cache line. This 
explains why the performance of 
both loops is roughly the same. 

Counting instructions to estimate 
the performance of an algorithm is 
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a useful approximation, because it's 
easy to assess and usually gives a 
good indication. But as this exam¬ 
ple shows, you have to keep in mind 
that it's just an approximation. In 
reality, the execution times of single 
instructions vary widely, and you 
can't rely on this number only. 
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Figure 4 


Data Size 

Does the size of a data structure 
affect performance? To answer this 
question, run a small experiment 
using the code in Listing 2. Take 
the second loop from the first code 
example and run it repetitively. 

This time, change the size of the 
array and measure the average time 
to run a single loop iteration. The 
purpose of this experiment is to 
run a trivial algorithm over a data 
structure whose size you can con¬ 
trol. Is there a relationship between 
the size of the array and the time 
needed to modify a single element? 

Before you look at the results, 
consider briefly what you expect. 
Accessing a single array element 
requires constant time, 0(1). Thus, 
the inner part of the loop should 
be executed in constant time, too. 
That means for large enough arrays, 
you will hit an upper bound that is 
constant. But what happens before 
that? Will the execution time be 
constant all the way through? 

Figure 5 shows the dependency 
between array size and access time. 
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LISTING 2 


private static final int ARRAY_CONTENT = 777; 

@Pa ram ({"1024", "2048", "4096", "8192", "16384",..., "536870912"}) 
public int size; 

public int[] a re¬ 
public int counter; 
public int mask; 

@Setup(Level.Iteration) 
public void setUp() { 
final int elements = size / 4; 
final int indexes = elements / 16; 
mask = indexes -1; 
array = newint[elements]; 

Arrays.fi 11 (a rray, ARR AY_CO NTENT); 
counter = 0; 

for (int i = 0; i < indexes; i++) { 
seqlndexfi] = 16 * i; 

} 

} 

(©Benchmark 
public void benchLoopQ { 
array[16 * counter) *= 3; 
counter = (counter + 1) &r mask; 

} 


Q Download all listings in this issue as text 


As you can see, there is a relation¬ 
ship between these values. A single 
modification is faster if the array 
is small. But it's not that simple. 
The resulting curve looks like a 
staircase. The access time remains 
constant until the array size 
exceeds a specific threshold, and 


then it jumps to a new level where 
it remains until the next threshold 
is reached. Why is there a depen¬ 
dency at all, and where do these 
levels come from? 

The cache is usually not a single 
unit, but consists of several lev¬ 
els with different sizes and access 
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Figure 5 


CPU 


UiE&A&CHY 



Main Memory 


Figure 6 

times. LI cache is the smallest 
and fastest. Current CPUs typi¬ 
cally have three cache levels, with 
each level being slower and signifi¬ 
cantly larger than the level before. 
Figure 6 shows an improved ver¬ 
sion of our model that contains the 
cache hierarchy. 

How large are the performance 
gaps between the different cache 
levels? To explain this in a form 


that is more accessible to human 
beings, my former colleague 
Richard Thompson came up with 
the beer cache hierarchy. Imagine 
that you're sitting in front of your 
TV watching your favorite team and 
you're thirsty. 

■ LI cache is the bottle of beer in 
your hand. Access time is almost 
immediate (< 1 ns), but the 
quantity is extremely limited (for 
example, 32 KB on my system). 

■ L2 cache is the cooler next to 
your sofa. Access time is still 
pretty low (7 ns), and the quan¬ 
tity is significantly larger (256 KB, 
which is equivalent to 8 bottles 
of beer). 

■ L3 cache is the fridge in the 
kitchen. Access time is noticeably 
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LISTING 3 


public int[] rndlndex; 

@Setup(Level. Iteration) 
public void setUp() { 


rndlndex = newintfindexes]; 
final List<Integer> list = new ArrayListo(indexes); 
for (int i=0; i<indexes; i++) { 
list.add(16 * i); 

} 

Collections.shuffle(list); 
for (int i=0; i<indexes; i++) { 
rndlndex[i] = list.get(i); 

} 


P Download all listings in this issue as text 


larger (25 ns), but the size is so 
large that the analogy falls apart 
(8 MB, which is equivalent to 256 
bottles of beer). 

■ Main memory is the corner store. 
Access time is huge (100 ns), but 
the quantity of beer is probably 
more than enough for a lifetime 
(16 GB, which is equivalent to 
more than half a million bottles 
of beer). 

Looking at these numbers, it 
becomes quite obvious why both 
loops in the initial example took the 
same amount of time. It doesn't 
really matter how many sips of beer 
you drink if you have to run to the 
corner store for each bottle. 


With the cache level hierarchy in 
mind, take a look at the graph in 
Figure 5. Each plateau in the graph 
corresponds to a level of the cache 
hierarchy. As long as the array fits 
into LI and L2 cache, access time 
is very low. But as soon as the array 
becomes too large and has to be 
read from L3 cache, access time 
increases noticeably. And the same 
happens again as soon as the array 
does not fit into L3 cache and has 
to be read from main memory. If 
you look closely, you can even see 
the small jump between the LI and 
L2 cache. 

Size does matter. Even though 
memory is cheaper than ever 
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before, try to avoid wasting it. The 
smaller the size of the data you 
use, the higher the chance that 
it will fit into the cache, which 
can lead to significantly better 
performance. 


Access Patterns 

So the size of data influences per¬ 
formance. Does the order in which 
you access your data—the data 
access pattern—have an influence, 
too? You can change the previ¬ 


ous experiment slightly to find an 
answer. Instead of simply running 
an index through the array, create a 
second array that stores the access 
order. Access the array sequentially 
as before for one time, and then 
access it randomly and measure 
the difference. You can see the code 
for both experiments in Listing 3. 

If you run the experiment with 
different array sizes and plot the 
result in a graph, you get two 
curves, as shown in Figure? Not 
surprisingly, you can see the already 
familiar staircase pattern. Both 
curves show similar access times 
on the lower two levels, which cor¬ 
relate to the LI and L2 caches. But 
on the third level, the performance 
of the sequential access pattern is 
noticeably better. The fourth level 
shows a significant difference. Why 


SequentiaI Access Random Access 



_ KB KB KB KB KB KB KB KB KB KB MB MB MB MB MB MB MB MB MB MB 

Figure 7 


michael@desktop -/repos/quantum/target/classes $ 

perf stat -p 3195 -B -r 40 sleep 

5 




Performance counter stats for process id 

'3195' 

(40 runs): 






4997,277608 task-clock 

# 

0,999 

CPUs utilized 

( 

+ - 

0,00% 

) 

[100,00%] 

529 context-switches 

# 

0,106 

K/sec 

( 

+ - 

0,04% 

) 

[100,00%] 

0 CPU-migrations 

# 

0,000 

K/sec 

[100,00%] 



0 page-faults 

# 

0,000 

K/sec 

( 

+ - 

35,51% 

) 


19.429.967.708 cycles 

# 

3,888 

GHz 

( 

+ - 

0,00% 

) 

[83,28%] 

19.006.043.778 stalled-cycles-frontend 

# 


frontend cycles idle 

( 

+ - 

0,00% 

) 

[83,31%] 

18.296.349.545 stalled - cycles - backend 

# 


backend cycles idle 

( 

+ - 

0,01% 

) 

[66,74%] 

1.386.651.837 instructions 

# 

0,07 

insns per cycle 







# 

13,71 

stalled cycles per insn 

( 

+ - 

0,05% 

) 

[83,37%] 

23.934.690 branches 

# 

4,790 

M/sec 

( 

+ - 

0,07% 

) 

[83,37%] 

23.139 branch-misses 

# 

0,10% 

of all branches 

( 

+ - 

11,00% 

) 

[83,30%] 

5,000586757 seconds time elapsed 




( 

+ - 

0,00% 

) 



Figure 8 


is the access order insignificant for 
small arrays, but plays a major part 
for large arrays? 

A Valuable Tool 

To get a better understanding of 
what is going on inside the com¬ 
puter, you can use the Linux pro¬ 
filer tool perf, which collects and 
prints out events generated by 
the CPU and the memory system 
while a program is executed. The 
command-line interface for perf 
is similar to that of git. You call 
perf with the command you want 
to execute: 

| perf COMM AN D[ARGS] 

To get a list of all commands, use 
perf -help. To get help for a specific 
command, you can run: 

| perf help COMMAND 

The most useful command is 
stat. It allows perf to run another 
program and tracks hardware 
events during execution: 

| perf stat [ARGS] PROGRAM 

Without any arguments, this 
command starts PROGRAM, tracks 
some general events, and prints 
out the statistics as soon as the 
program ends. You can see a 
typical output in Figure 8. You can 
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specify which events should be 
tracked by using the -e option. To 
get a list of supported events, run 
the command: 

| perflist 

In Java, you usually don't want to 
track the whole program, because 
this would include bootstrapping 
the VM, JIT compilation, and so on, 
which typically we're not interested 
in. However, with perf you can hook 
into a running process with the -p 
option. Perf is a little weird when 
used with the -p option, because 
you still have to specify a program 
that will be executed, and the mea¬ 
surement ends once this program 
ends. Typically you'd use the sleep 
command, which enables you to 
specify the duration of the test. 

First, measure both your loops 
using the default settings of perf. 


This provides a good overview. 

The most-interesting results can 
be seen in Table 2. The number 
of stalled front-end and back-end 
cycles differs significantly between 
both runs. A stalled cycle means 
the CPU is idle and waiting for 
something. One of the most likely 
causes of a stalled front-end cycle 
is a cache miss, which results in the 
CPU waiting for data to arrive from 
main memory or a slower cache. 

To validate this assumption, you 
can run perf again, but this time to 
specifically measure cache loads 
and cache misses. You can see the 
result in Table 3. 

The ratio between success¬ 
ful cache loads and cache misses 
differs tremendously. When 
accessing the array in sequential 
order, only about 6 percent of all 
memory loads result in a cache 
miss. But when accessing the 



SEQUENTIAL ACCESS 

RANDOM ACCESS 

CYCLES 

19 , 430 , 435,800 


19 , 429 , 967,708 


STALLED FRONT-END CYCLES 

7 , 217 , 361,632 

( 37 . 14 %) 

19 , 006 , 043,778 

( 97 . 82 %) 

STALLED BACK-END CYCLES 

843 , 462,646 

( 4 . 34 %) 

18 , 296 , 349,545 

( 94 . 17 %) 


Table 2 


SEQUENTIAL ACCESS RANDOM ACCESS 

LI CACHE LOADS 

5 , 758 , 001,370 


170 , 655,221 


LI CACHE MISSES 

360 , 757,378 

( 6 . 27 %) 

365 , 959,699 

( 214 . 44 %) 


Table 3 
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array in random order, two out of 
three loads result in a cache miss. 
The high number of cache misses 
is expected, because the array 
doesn't fit into the cache, and you 
have to load everything from main 
memory. But why are there almost 
no cache misses when you access 
the array sequentially? 

Loading data from main memory 
into the cache is often a major 
bottleneck. For this reason, the 
CPU tries to help by guessing which 
data you'll use next and loading it 
into the cache in the background, 
as you can see in Figure 9. While 
you modify the elements of a 
cache line, functionality called the 
prefetcher loads the next cache 
line into the cache. Thus, when you 
need the data, it's already available 
in the cache. 

The prefetcher is not par¬ 
ticularly smart. It can guess the 
next memory location correctly 
only if the memory loads fol¬ 
low a regular pattern. In the first 
case, when you went through 
the array sequentially, guessing 
the next memory location was 
easy, and the prefetcher could 
mitigate a substantial part of the 
performance loss by prefetching 
the next memory location. But 
when you accessed the array ran¬ 
domly, guessing the next memory 
location correctly was impos¬ 
sible, and the algorithm had to 



MEANWHILE 
PREFETCH 
NEXT CACHE 
LINE 

anGGHD 


Figure 9 


wait until data was loaded from 
main memory. 

The access pattern has a signifi¬ 
cant influence on the performance 
of an algorithm, but the chances to 
use this knowledge within Java are 
limited. You have close to no con¬ 
trol over how your data is arranged 
in memory. But there is hope. The 
proposed value types might provide 
this ability one day, because they 
allow you to arrange object-like 
structures sequentially in memory. 

Another Valuable Tool 

Microbenchmarking is another 
valuable tool for getting more 
insight into your programs. 

Probably the most important 
rule of microbenchmarking is to 
always use a tool that helps you to 
avoid some of the many pitfalls, 
such as the insufficient warmup 
of the VM, dead code elimina¬ 
tion, and loop unrolling. The Java 
Microbenchmark Harness (JMH) 
from Oracle is probably the best 
harness available right now. 
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ANNOTATION 

DESCRIPTION 

@BENCHMARKMODE 

SPECIFIES WHAT SHOULD BE MEASURED—FOR EXAMPLE, THROUGHPUT OR AVERAGE TIME. 

@OUTPUTTIMEUNIT 

SPECIFIESTHETIMEUNITUSED INTHEOUTPUT-FOR EXAMPLE,TimeUnit.MILLISECONDS. 

iWARMUP 

SPECIFIES THE WARMUP PHASE. YOU CAN SET THE NUMBER OF ITERATIONS AND THE 

DURATION OF A SINGLE ITERATION. 

©MEASUREMENT 

SPECIFIES THE MEASUREMENT PHASE AND IS SIMILAR TO ©WARMUP. 

©FORK 

SPECIFIES HOW OFTEN YOU WANTTO FORKTHE JAVA VIRTUAL MACHINE (JVM) AND RUN THE 
TESTS. YOU SHOULD ALWAYS DO RUNS IN SEVERAL FORKS. 


Table 4 


Tests written for JMH are simi¬ 
lar to JUnit tests. The code you 
want to benchmark needs to be in 
a single method, which must be 
annotated with (^Benchmark. The 
test can be configured with anno¬ 
tations at the class level. Table 4 
shows the most-important anno¬ 
tations and their meaning. 

Conclusion 

Most of the time, processes at the 
hardware level have no significant 
effect on programs, but some¬ 
times they do. Therefore, it's use¬ 
ful to have a rough understanding 
of what goes on at the hardware 
level and to keep up with the latest 
developments. Besides being use¬ 
ful, this knowledge is also fascinat¬ 
ing and a great way to impress your 
fellow developers. 

This was the first part of a 
two-part series about the quan¬ 
tum physics of Java. This part 


focused on memory and, spe¬ 
cifically, the cache hierarchy. The 
second part looks a little more into 
memory and then takes a deep 
dive into the inner workings of a 
modern CPU. </articie> 
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3 Billion 

Devices Run Java 
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Shakespeare Plays Scrabble 

New ways of solving problems with the Stream API in Java SE 8 


T he Collections Framework, 
first released with java 2, 
was designed on the Iterator 
patterns (among others) 
and is at the heart of all our 
applications. With the inclu¬ 
sion of generics in java 5, the 
Collections Framework was 
rewritten, but the patterns for 
using it weren't modified. 

java SE 8 introduced 
lambda expressions and the 
Stream API. The Stream API 
brings new patterns and new 
ways of solving the prob¬ 
lems we encounter in busi¬ 
ness applications that were 
previously solved with the 
Collections Framework. This 
article shows how you can 
use the Stream API to solve a 
simply stated problem built 
on the Scrabble game in just 
a few lines of code. 

Shakespeare and Scrabble 

Let's say we have two files. 
One contains (supposedly) 
all the words Shakespeare 
used in his plays and poems. 


The other contains all the 
words in The Official Scrabble 
Players Dictionary. 

The question is this: How 
well would Shakespeare have 
played Scrabble? Or, what is 
the best word Shakespeare 
could have played, and how 
much would it have scored? 

It turns out that this ques¬ 
tion is a map/fiIter/reduce 
question that can be solved 
with the Stream API. 

Compute the Score of a 
Single Word 

Computing a score is about 
taking a given word and 
returning a score that is an 
integer. This can be done with 
Function<String, Integers 
which can be implemented 
with a lambda expression. 

Let's build an array, letter 
Scores, which gives the score 
of each letter in the English 
Scrabble game (Listing 1). 
From this array, we can see 
that the score of m is 3, and 
the score ofyis4. 


Inside this scoring function, 
we also need a mapping func¬ 
tion to get the score of a given 
letter. Let us write this map¬ 
ping function first. It is very 
simple and uses the old trick 
of reading the correct cell of 
the array that holds the scores 
of the letters (letterScores). 

I IntUnaryOperator letterScore = 
letter -> 

letterScores[letter - ’a 1 ]; 

This mapping function is 
an IntUnaryOperator, a spe¬ 
cialized type of function that 
takes and returns an int. 

We can then write the scor¬ 
ing function using the map¬ 
ping function: 

Function<String, 

Integer> score = word -> 
word.charsQ // IntStream 
.map(letterScore) 

•sum(); 

Having the chars() method 
added to the String class 


proves handy. Many classes 
have new methods that 
return streams on their inter¬ 
nal structure. 

The choice has been made 
to represent letters by inte¬ 
gers, and the chars() method 
returns an IntStream instead 
of a Stream<Integer>. 

Three streams of primitive 
types (IntStream, LongStream, 
and DoubleStream) are cre¬ 
ated for performance rea¬ 
sons. Using them is much 
more efficient than using the 
equivalent streams of objects, 
because we don't have to 
pay the cost of boxing and 
unboxing their elements at 
each step. But sometimes we 
need to convert this IntStream 
into a Stream<Integer>, which 
is the case when we have to 
build histograms later. 
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How Shakespeare Performs 

Now that we can compute 
the score of a given word, 
building the histogram of 
Shakespeare's words by their 
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scores is easy. A histogram is a 
simple map, computed by group¬ 
ing the words by their scores. Thus, 
the keys of that map are the scores 
of the words, and the values are 
the words themselves, regrouped 
into lists. 

Are we interested in all the con¬ 
tent of the histogram? No. What we 
want to see is only the best scores. 
So we need to select the best key- 
value pairs from that map—with 
the best being the ones that have 
the greatest values for their keys. 

A simple way of computing that is 
like this: 

1. Build the map. 

2. Sort the map according to its 
keys, in descending order. 

3. Extract the best (for example, 
the first three) key-value pairs. 

Build the map. Building a map by 
regrouping is done with a collector. 
Collector is a new interface in the 
Java SE 8 Stream API that provides 
a mutable reduction. Mutable 
here means that we reduce our 
stream into a mutable container— 
a map, in the example shown in 
Listing 2. 

Sort the map according to its keys. 
In Listing 2, shakespeareWords 

is a collection of all the words of 
Shakespeare, from which a stream 
can be obtained. We can regroup 
these words by the scores of each 
word. The Collectors.groupingByO 
code does just that. It exists in 


several versions, taking different 
sets of parameters. Here, we pass 
a supplier to build the resulting 
map, which is implemented by a 
TreeMap. We pass a comparator 
as a parameter to its construc¬ 
tor to have the map sorted in the 
descending order of its keys. The 
last parameter is a downstream 
collector to specify that we want 
our words in a list. 

We added an extra filtering step 
to make sure that we process only 
words that are allowed in Scrabble. 
Extract the best key-value pairs. 
Selecting the best key-value pairs 
can easily be done with the Stream 
API, but we need to use a trick 
here, because the Map interface 
doesn't define a stream() method. 
We can't build streams directly 
on maps in Java SE 8. The trick 
is to build that stream on the 
set of the key-value pairs, which 
can be built using the entrySetQ 
method call. This method returns 
a Set<Map.Entry<K,V», on which 
we can build a stream. Because we 
built a TreeMap, this set is in fact a 
NavigableSet, which keeps the order 
of the key-value pairs. Displaying 
the three best elements becomes 
an easy task: 

wordsByScore.entrySet() 

.streamO 

Jimit(3) // or any value 

.forEach(System.out::println); 
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private static final int [] letterScores = { 

// a, b, c, d, e, f, g, h, i, j, k, I, m, n, o, p, q, r, s, t, u, v, w, x, y, z 
1, 3, 3, 2,1,4, 2,4,1, 8, 5,1, 3, 1,1, 3,10,1,1,1, 1,4, 4, 8,4,10) 


^3 Download all listings in this issue as text 


At this point, the best word is 
whizzing, which is worth 33 points. 

Not All Words Can Be Written 

Take a closer look at that word 
whizzing. Can we really write that? 
Notice that it contains two occur¬ 
rences of the letterz, but only one 
is available in the Scrabble game. 
How can we discard such a word? 

What we need to do to tell 
whether a given word can be writ¬ 
ten with the letters that are avail¬ 
able in the game is to count them, 
letter by letter, and compare the 
number of needed letters to the 
number of available letters. If this 
test fails, then the word cannot 
be written. 

This operation can be done with 
an allMatch() call. Counting all the 
letters of a given word consists of 
building a histogram in which the 


keys are the letters, and the values 
are the number of times each let¬ 
ter is used to write that word. Let 
us first build this histogram (see 

Listing 3). 

The call to the boxed() 
method converts the IntStream 
returned bythecharsQ call into 
a Stream<Integer>. This is neces¬ 
sary because the collect(Collector) 
method we want to use is 
defined on streams of objects, 
not on streams of primary types 
(IntStream, LongStream, and 
DoubleStream). 

Then we need to build a stream 
on its key-value pairs and compare 
those entries with the available 
letters in the game, as shown in 
Listing 4. In each pair, the key is 
the letter and the value is the num¬ 
ber of times that letter is used in 
the word. 
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The number of available letters 
in the game is given by reading 
the array scrabbleAvailableLetters, 

declared at the beginning of 
Listing 4. From this array we can 
see that nine i's are available, and 
only onez. 

Using this predicate, we can now 
add a filtering step in the computa¬ 
tion of the best word. 

At this point, the best word is 
squeezes, which is worth 26 points. 


What About the Blanks? 

Thinking again, though, we realize 
that we could write the word whiz¬ 
zing. The game uses blank letters, 
and we could use one for the sec¬ 
ond z. Using blanks has a double 
impact. First, it changes the way 
we filter out the words. Second, it 
changes the way we compute the 

scores of the 
words, because 
a blank does not 
score any points. 
Write a word 
using blanks. 
Computing 
the numberof 
blanks needed 
to write a given 
word is also a 
matter of com¬ 
paring the his¬ 
togram we built 
for the given 
word to the 


THINK SOLUTIONS 


Java SE 8 
requires us to 

think about 
the solutions 

for our problems 
differently. As a 
bonus, we get 
cleaner and more- 
efficient code. 


histogram of the available letters 
in the game. Instead of comparing 
the number of letters one by one, 
we count the number of letters that 
are greater than the ones available. 
Subtracting the needed letters from 
the available letters does what we 
need: If the result is negative, then 
we don't need a blank. If not, then 
the result is the number of blanks 
needed for that letter. All we need 
to do is compute which is greater, 
this number or zero, for each let¬ 
ter of the given word. Summing 
the results gives us the number of 
blanks needed for that word. 

First, write a function that takes 
a key-value pair from the word's 
histogram and returns the num¬ 
ber of blanks for a given letter (see 
Listing 5). We can then use this 
function to compute the number 
of blanks for a word, as shown in 
Listing 6. 

We can now add a filtering step in 
the computation of the best word, 
built on the following predicate, 
knowing that we have two blanks in 
the game: 

I Predicate<String> checkBlanks = 
word -> 

nBlanks.apply(word) <= 2; 

Compute the score, including 
blanks. Computing the score of a 
word can't be conducted with the 
scoring method we used earlier, 
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ToLongFunction<Map.Entry<Integer, Long» blank = 
entry -> 

Long.max( 

OL, 

entry.getValue() - 

scrabbleAvailableLetters[entry.getKey() - ’a 1 ] 

); 


£3 Download all listings in this issue as text 


because all the letters don't nec¬ 
essarily count. What we can do 
is, once again, use this histogram 
and multiply the score of each 
letter with the number of the let¬ 
ter used, maxed by the number of 
available letters. 

Let us write a function that com¬ 
putes the score of a given key-value 
pair from this word's histogram. 

See Listing 7. We can then use the 
function created in Listing7 to 
compute the score of a word, as 
shown in Listing 8. 

Using the new scoring function 
created in Listing 8 instead of the 
first one we created, and adding 
the filtering step to check whether 
the word can be written with two 
blanks, confirms that squeezes is 
the best word with 26 points. The 
first word we found, whizzing, isn't 
far away, with 23 points. 

Place the Word on the Board 

But our problems aren't quite over 
yet. What about placing the word 


on the board? We have two special 
squares, not far from the central 
one, that provide a bonus because 
they are marked "DOUBLE LETTER 
SCORE." Would it be possible to 
take those squares into account 
to compute the final score of our 
first move? 

First, notice that one of these 
squares is four squares away to 
the right of the central square. So 
it will give a bonus to a word that 
contains five or more letters. For a 
word that has between five letters 
and seven letters (the maximum 
any player ever has), there are three 
placement possibilities to the right 
of the central square and three 
other possibilities to the left of the 
central square. So, for a seven-letter 
word, optimizing the placement of 
the word determines the highest- 
scoring letter from the word's first 
three and last three letters. 

To compute the final score, we 
need to find this special letter, add 
the normal score that we previ- 
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ously computed, multiply by two 
(for the first-move bonus), and 
add 50 points if the word is seven 
letters long (the bonus for using 
all our letters in one word). Of 
course, the hard part is to find the 
special letter. 

The first and last three letters of 
a given word can be extracted with 
the code shown in Listing 9. These 
two streams hold the letter that 
will be double-scored. Note that 
if the word we consider is shorter 
than four letters, the streams will 
be empty. But that's not a problem, 
because the stream.max() method 
returns an instance of Optional, 
which can be empty. 

How can we merge those two 
streams into one? We have two 
solutions for that: Use either 
the concat() method or the of() 
method. Because the concat() call 
is stateful, we might think it won't 
be as efficient in parallel as of(), so 
let's use of() (see Listing 10). 

This merged stream of letters still 
needs to be mapped to a stream of 
scores for the letters, and then the 
maximum score needs to be deter¬ 
mined. Because there are words 
of fewer than four letters, the two 
streams can be empty and merg¬ 
ing them will also return an empty 
stream. So we should ensure that 
we cope with an empty Optional 
correctly. If we have no letter that 
can be placed on the "DOUBLE 
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Function<String, IntStream> first3 = 
word -> word.chars().limit(3); 

Function<String, IntStream> Iast3 = 
word -> word.chars().skip(Integer.max(0, word.length() -4); 


P Download all listings in this issue as text 


LETTER SCORE" square, this func¬ 
tion should return zero, meaning 
that no bonus will be awarded. See 

Listing 11. 

We can then write our last 
method, which computes the full 
score of the best first move, as 
shown in Listing 12. 

We should also add another 
filter step to remove words longer 
than seven letters, because seven 
is the maximum number of let¬ 
ters allowed in a Scrabble word. 
With this new scoring function, 
we can see that there are two best 
words, jezebel and quickly, for a 
score of 120 points. Not bad for an 
opening move! 

Conclusion 

The problems we solved in this 
article might look like trivial game 
problems, but they solve many of 
the problems we face in general 
business applications: building his¬ 
tograms, comparing them, extract¬ 
ing maximum values, and han¬ 
dling corner cases raised by empty 


streams. Using the Stream API, we 
can solve these problems without 
much code—and the solution is 
still efficient from a pure perfor¬ 
mance point of view. 

We used some of the new pat¬ 
terns introduced in Java SE 8, but 
the Iterator and the if-then-else 
patterns are absent from the solu¬ 
tion. As we can see here, Java SE 8 
requires us to think about the solu¬ 
tions for our problems differently. 
As a bonus, we get cleaner and 
more-efficient code. 

The Official Scrabble Players 
Dictionary and the words in the 
works of Shakespeare, as well 
as many other interesting data 
sets, can be found on Robert 
Sedgewick's page "Real-World 
Data Sets ." </article> 


/learn more 

• Stream API 

• Collector interface 

• Java 8 
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BIG DATA 
AND JAVA 

Experts answer your questions. 

BY FABIANE NARDON, FERNANDO BABADOPULOS, 
DANIEL TEMPLETON, SIMON ELLISTON BALL, 

AND CHRIS RICHARDSON 


W ith tools such as Apache Hadoop, Apache 

Hive, Apache Pig, Apache HBase, and others, 
Java has become the driving force behind 
the big data revolution. As a Java developer, 
are you prepared to be part of this new generation of Java 
applications? Do you have a million questions and nobody 
to ask? See how these big data experts answered the best 
questions posed by the audience during the "Big Data and 
Java: Ask the Experts" panel at JavaOne 2014. 

Fabiane Nardon, a Duke's Choice Award winner, is the 
chief scientist at Tail Target, where she architects new 
disruptive data science services. Fernando Babadopulos, 
an expert in designing high-performance software, is 
the CTO at Tail Target. Daniel Templeton has focused 
on big data since 2009 and works at Cloudera build¬ 
ing its developer and data science Cloudera Certified 
Professional certifications. Simon Elliston Ball, a solutions 
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Comparing Hadoop to a traditional relational system, Daniel Templeton (top) says, “There 
might be more differences than similarities.” Fabiane Nardon (bottom): “The choice of 
enterprise-ready NoSQL database you need depends on the problem you want to solve.” 
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“Big data... can’t 
be processed 
by traditional 
data processing 
techniques,” says 
Java Champion 
Chris Richardson. 


engineer at Hortonworks, helps cus¬ 
tomers solve problems with Hadoop. 
Chris Richardson, a Java Champion, is 
the author of POJOs in Action and the 
founder of the original Cloud Foundry, 
a Java platform as a service (PaaS) 
for Amazon EC2. He now consults 
on microservices and works on his 
third startup. 
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RICHARDSON: 

“Big data isn’t necessarily a large volume of data. It can be data 
that is generated at a high velocity. Big data can also be data 
that has a lot of variety, such as unstructured data.” 


Question: What is big data? 

Richardson: Big data is a somewhat 
nebulous term that describes data 
that can't be processed by traditional 
data processing techniques, such as an 
RDBMS-based application running on a 
single machine. Big data isn't necessar¬ 
ily a large volume of data. It can be data 
that is generated at a high velocity. Big 
data can also be data that has a lot of 
variety, such as unstructured data. 

Big data technology, such as Apache 
Hadoop, tackles the problems of vol¬ 
ume and velocity by scaling horizontally 
using fault-tolerant software, which 
tends to be cheaper and more scalable 
than the more traditional approach of 
vertically scaling very reliable hardware. 
Apache Hadoop deals with variety by 
using storage formats that support 
both unstructured and structured data. 
Machine learning (ML) algorithms are 
commonly used to process big data. 
Question: What are the main NoSQL 
databases? 

Nardon: There are plenty of enterprise- 
ready NoSQL databases out there. The 


choice depends greatly on the prob¬ 
lem you want to solve. Considering the 
popular data model classification, some 
good examples are 

■ Column store: Apache Cassandra, 
Apache HBase, and Apache Accumulo 

■ Document store: MongoDB, 

CouchDB, and RavenDB 

■ Key-value store: Redis, Riak, 

Amazon DynamoDB, Aerospike, and 
FoundationDB 

■ Graph store: Neo4i. Titan. OrientDB. 
Apache Girapja, and InfiniteGrafah 

Question: What is the best way to get 
started with Apache Hadoop? 
Templeton: Hadoop is complex, and 
getting started with it can be daunting. 
However, many tutorials and training 
options are available to help you. 

First, go buy (and read) a copy of 
Hadoop.- the Definitive Guide. It's the 
single, most referenced and useful text 
on Hadoop. 

Online, you'll find many tutorials. The 
venerable Yahoo! Hadoop Tutorial is still 
a good place to start. You should also 
look at the "Hands-on Hadoop" lab, 


























































BABADOPULOS: 

“Because Hadoop 
needs to set 
up the job and 
synchronize all 
nodes to process 
the data, the more 
data it has to 
process at once, 
the more efficient 
it will be.” 
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Fernando Babadopulos, CTO, and Fabiane Nardon, chief scientist, both at Tail Target, 
discuss how much data you need to start using Hadoop and NoSQL. 


which was session H0L4041 atJavaOne 
2014. At the time of this article, Oracle 
had not yet posted the lab materials 
online. The full lab is available on GitHub. 
The lab manual includes instructions 
for performing the lab outside of the 
JavaOne environment. The slides used 
during the lab are also online. 

For more hands-on training, Cloudera 
has the most popular set of training 
courses, with options such as data 
analysis with SQL on Hadoop, introduc¬ 
tion to data science, and Apache Spark. 

To get your hands dirty, Cloudera 
offers Cloudera QuickStart VM and 
Hortonworks offers Hortonworks 
Sandbox. Both are VM images that 
include a complete single-node clus¬ 
ter. Cloudera also offers Cloudera 
Live, which is a live, dedicated cluster 
hosted for free (for up to two weeks) in 
the cloud. 

An easier entry point for Hadoop 
than MapReduce is a higher-level proj¬ 
ect, such as Apache Hive, Apache Pig, 
or CJoudera Impala. Because Hive and 
Impala use a language that is close to 
SQL, the learning curve is often shal¬ 
low. Tutorials for all three are online. The 
Hortonworks Sandbox includes Hive 
and Pig. The Cloudera QuickStart VM 
and Cloudera Live include all three. 

If you're just getting started with 
Hadoop, you should also take a look 
at Apache Spark, which is en route to 
replace MapReduce as the core compu¬ 
tation engine in the Hadoop ecosystem. 
The available resources for learning 


Spark are still somewhat limited. A good 
source of learning materials is the latest 
UC Berkeley AMP Camp archive. Spark 
is also much easier to simply download 
and play with, because it offers a stand¬ 
alone mode. Spark is included in the 
Cloudera QuickStart VM, Cloudera Live, 
and the Hortonworks Sandbox. 
Question: How much data do you need 
to start using Hadoop and NoSQL? 
Babadopulos: It depends on the kind of 
job Hadoop is running. Because Hadoop 
needs to set up the job and synchro¬ 
nize all nodes to process the data, the 
more data it has to process at once, the 
more efficient it will be. To optimize 
space inside the Hadoop Distributed 
File System [HDFS], files must be 


bigger than the size specified by the 

dfs.namenode.fs-limits.min-block-size 

property. 

Nardon: With NoSQL databases, 
many times the decision of using one 
depends more on the need to use a 
more flexible data model than on the 
amount of data itself. For example, 
even if you don't have a huge docu¬ 
ments database, it might be more 
convenient and natural to store your 
documents in a MongoDB database 
than in a traditional relational database, 
because the MongoDB data model is 
optimized for document-like structures. 
Question: How can you move data into 
Hadoop if you aren't using NoSQL? 
Nardon: If you use a relational database 
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and want to move the data into HDFS, 
process it, and then maybe move the 
result back to the relational store, you 
can use Apache Sqoop . Alternatively, 
you can use the Hadoop DBInputFormat 
component to get relational data 
into Fladoop. 

If you have your data in log files, you 
can use Apache Flume to get the data 
into HDFS. 

Question: Is it possible to use Apache 
Fladoop without needing to code in 
Java? 

Templeton: Absolutely! There's a long 
list of options that let you use Fladoop 
without Java. What follows is a quick 
summary of the most-popular options. 

■ SQL on Fladoop: You now have a 
myriad of options to run SQL or SQL- 
like queries against your data stored 
in Fladoop. The two most popular are 
Hive and Impala, both of which are 
Apache-licensed open source projects. 

■ Apache Spark: Spark is the newest 
addition to the Fladoop family and 
might eventually replace MapReduce 
as the job execution engine. Spark 
supports code written in Scala, 
Python, and Java. 

■ Hadoop streaming: Hadoop includes 
a package called streaming that uses 
a simple input/output contract to 
let you write MapReduce jobs in any 
language you like. In fact, a common 
use of Hadoop streaming is to exe¬ 
cute standard Linux commands such 
as grep, cut, and awk against a large 
data set. 

PHOTOGRAPH BY BOB ADLER/GETTY IMAGES 


■ Pig Latin: Another common non-Java 
option is Apache Pig, which lets you 
manipulate data in a high-level lan¬ 
guage called Pig Latin. 

Question: What is the largest amount 
of data that Hadoop can handle? 
Babadopulos: Hadoop can store data 
inside a proprietary distributed file 
system called HDFS as well as in cloud- 
based storage systems—for example, 
Amazon's Simple Storage Service [S3], 
The largest amount of data Hadoop 
can handle is limited by the amount 
of data those file systems can store. 
Because HDFS is a distributed file sys¬ 
tem, the cluster can be scaled up to 
support more data as needed. 

A DataNode can also use more 
than one disk to optimize and 
increase storage. The storage 
capacity of HDFS clusters can 
be increased either by adding 
a new DataNode to the cluster 
or by adding disk volumes to 
existing DataNodes. 

Question: How does Apache 
Hadoop differ from a traditional 
relational system in terms of 
infrastructure? 

Templeton: There might be 
more differences than similari¬ 
ties. The following are the two 
most significant infrastructure 
differences: 

■ Hadoop is built on an inher¬ 
ently distributed model. 

Hadoop assumes that clus¬ 
ters are composed of a large 


number of inexpensive, unreliable 
nodes, and so the entire system is 
built around redundancy and fault 
tolerance. The typical relational sys¬ 
tem is built as a centralized system 
that runs on a small number of pow¬ 
erful and reliable nodes. 

In Hadoop, the data is completely 
decoupled from the metadata or 
table structure. This approach is 
called schema on read. For example, 
you can upload data into HDFS and 
then create a table over that data in 
Apache Hive. The table metadata 
exists only in the Hive metastore; 
the data itself is unaffected. If you 


TEMPLETON: 


“Hadoop is complex, 
and getting started 
with it can be 
daunting. However, 
many tutorials and 
training options are 
available to help you.” 



“An easier entry point for Hadoop... is a higher-level project, such as Apache Hive, 
Apache Pig, or Cloudera Impala,” says Daniel Templeton at Cloudera. 
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“The Lambda Architecture 
is a much-debated topic 
in the big data community 
at the moment,” says 
Simon Elliston Ball at 
Hortonworks. 


PHOTOGRAPH BY JOHN BLYTHE 


change your mind about the table 
schema, you can drop the table (with¬ 
out affecting the data) and create a 
new one. You can even create more 
than one table over the same data 
files. Relational systems are schema 
on write , meaning that the data and 
the table metadata are inextricably 
bound together when the data is 
brought into the system. 

Question: What is the overhead of 
using Hive over Hadoop? 

Elliston Ball: Hive works by building 
a series of MapReduce steps from the 
SQL queries. This is the main source of 
overhead. Hive was designed to run very 
fast over large data sets—as soon as it 
gets started—but it isn't the fastest out 
of the gate. Something else to note here 
is the recent progress on Hive perfor¬ 


mance with Apache Tez. The big differ¬ 
ence Tez makes is to optimize the graph 
of MapReduce jobs created by Hive. In 
addition, it skips an important source of 
overhead: Hive used to have to write a lot 
of intermediate data out to disk between 
each MapReduce step. Tez can remove a 
lot of this overhead, if you turn it on. 

Long-running Tez sessions and reuse 
of containers can take several seconds 
off the startup time, and a wide range 
of tuning options also help reduce 
overhead from the Java Virtual Machine 
[JVM] startup. Most distributions don't 
turn on all the optimizations, so the 
most common cause of overhead at 
the moment is just not turning on the 
new features. 

Question: Can you talk about the 
relationship between the Lambda 
Architecture and Hadoop? 

Elliston Ball: The Lambda Architecture 
is a much-debated topic in the big 
data community at the moment. Even 
before the ink dried on Nathan Marz' 
and James Warren's book—Big Data: 
Principles and best practices of scalable 

realtime data systems— which outlined 
the architecture, developers were lining 
up on either side to critique it. 

The approach works by separating the 
batch and speed layer of your applica¬ 
tion. Hadoop has traditionally been all 
about batch processing, so for abso¬ 
lutely up-to-the-second analytics, you 
needed a streaming system on top to 
do intermediate updates on the earlier 
batch results. A serving layer would 
then combine them. 


The best use for the Lambda 
Architecture is when you have a slow- 
running algorithm running in a batch 
across a large data set, which can be 
approximated by a short-running ver¬ 
sion over a stream of updates. Periodic 
runs of the batch approach are then 
used to get a more precise answer. An 
example is a recommender system 
that performs complex math over your 
entire buying history, which can be 
incrementally improved using a simpler 
formula on today's transactions and 
then be brought in with the batch algo¬ 
rithm by an overnight job. 

Hadoop provides a perfect plat¬ 
form for implementing the Lambda 
Architecture. The batch layer can be 
done in MapReduce, or something 
like Hive and Pig, while Apache Storm 
provides a good platform to imple¬ 
ment the real-time speed layer, using 
HBase as the serving layer. With YARN, 
all this lives on one cluster. Spark— 
and Spark Streaming—also provides 
a great framework for the Lambda 
Architecture, and because of its micro¬ 
batch approach, it might allow some 
reuse of code between the layers. 
However, if you are reusing a lot of code, 
you're probably not using the Lambda 
Architecture as it was intended. 
Question: Is there "Hadoop as a 
Service"? 

Babadopulos: Yes, some companies pro¬ 
vide Hadoop as a service—for example: 

■ Amazon EMR deploys a Hadoop 
cluster inside an Amazon EC2 
infrastructure. 
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NARDON: 

“Many companies 
use Hadoop in 
production right now 
to run important 
services, here is a 
thriving ecosystem 
around it.” 


■ Quboje can deploy Hadoop as a ser¬ 
vice (and many other tools around 
the Hadoop ecosystem, such as 
Hive, Pig, Apache Oozie, and Sqoop) 
either using Amazon EC2 or GoogLe's 
Compute Engine. 

■ Xplentv has an intuitive interface for 
designing ETL dataflows and also 
provides Hadoop with a few clicks. 

■ Microsoft's HDInsight also pro¬ 
vides Hadoop as a service using the 
Microsoft Azure cloud platform. 

Question: What can you say about 
machine learning algorithms? 
Richardson: The big idea in ML is the 
development and use of algorithms 
that learn from data by constructing 
models that make predictions or deci¬ 
sions. For example, a common use 
of ML is to make product or article 
recommendations based on a user's 
past behavior. One widely used ML 
library is Apache Mahout. The scalable 
algorithms implemented by Apache 
Mahout were originally built using 
Hadoop MapReduce. However, more- 
recent implementations are based on 
the much faster Apache Spark. Apache 
Spark also has its own ML library called 
MLjib. A key part of many applications is 
a big data pipeline, which extracts data 
from an online application into the big 
data system and runs ML algorithms to 
construct models, which are then used 
by the online application. 

Question: What about tools for gover¬ 
nance in big data? 

Elliston Ball: Forgoing governance is 
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“The big idea in ML is the development and use of algorithms that learn from data by 
constructing models that make predictions or decisions,” says Chris Richardson. 


no longer an option. In the early days of 
Hadoop, there weren't a lot of options 
for good data governance and security. 
However, now governance is very much 
part of the mainstream data architec¬ 
ture as a key tool. 

A lot of things come under the banner 
of governance—for example, ensuring 
security, controlling and auditing access 
to data, managing metadata, and man¬ 
aging data lineage—but essentially it 
boils down to people and policies. The 
Hadoop platform has tools that assist 
with these tasks, but often good gov¬ 
ernance comes down to a culture of 
metadata awareness. 

On the tools front, each major dis¬ 
tribution handles issues of governance 
in a different way. Cloudera has a pro¬ 
prietary solution (Cloudera Manager 
and Cloudera Navigator). Hortonworks' 
tools are 100 percent open source, with 


Apache Ranger, Apache Knox Gateway, 
and Apache Falcon. Tools from tradi¬ 
tional ETL vendors, such as Informatica 
and Talend, also work well to provide 
governance-related services. 

Question: Is Hadoop enterprise-ready? 
Nardon: Yes! Many companies use it in 
production right now to run important 
services. There is a thriving ecosystem 
around it. Many commercial companies 
offer services for Hadoop, such as dis¬ 
tributions, support, hosting, consulting, 
and so on. A few of them are Cloudera, 
Hortonworks, MapR, and Amazon. 
Richardson: There are also newer com¬ 
panies, such as Databricks. </article> 


/learn more 

• " Big Data and lava: Ask the Experts " 

panel at JavaOne 2014 
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Improving the Performance of 
Java EE Applications 

Incorporate performance tuning into your development lifecycle. 

JOSHJUNEAU 



BIO 




Java and 
Performance 


I magine that your appli¬ 
cation, after months of 
development time and lots 
of testing, is finally ready for 
production deployment. The 
application goes live, and for 
the first few minutes every¬ 
thing seems like a success. 

Suddenly you receive a 
phone call from one of the 
users telling you that they 
can't access the applica¬ 
tion, because it appears to 
load indefinitely. Moments 
lateryou receive another call, 
and yet another. It seems 
as though your application 
doesn't allow concurrent 
access to enough users. This 
is a nightmare scenario in 
which the application devel¬ 
opment team must make 
critical performance tuning 
decisions underthe gun, and 
they must make each of the 
changes to the production 
environment directly. 


Where do you begin in this 
situation? What could be 
wrong? Is a server configura¬ 
tion preventing enough con¬ 
nections for your production 
environment, or does the 
application code contain a 
bottleneck that causes the 
users to wait? 

Many applications are not 
tuned for performance until 
after they've been put into 
production. Unfortunately, 
many organizations do not 
deem performance tuning 
to be an essential part of the 
development lifecycle, but 
rather, treat it as a triage 
step when things go wrong. 
To make matters worse, 

Java EE performance tun¬ 
ing can sometimes be 
much like finding a needle 
in a haystack. The cause of 
the performance issue can 
lie just about anywhere, 
from the application source 


code to a server configuration. 

In this article, you'll see 
some of the processes that 
you can use for tuning your 
Java EE applications pro¬ 
actively, so you can try to 
avoid scenarios like this one. 

Approaching the Beast: 
Proactive Tuning 

Performance tuning should 
be made part of the standard 
development lifecycle. As 
such, applications should be 
designed from the ground up 
with performance in mind. 


To be forward-thinking about 
performance means to con¬ 
sider carefully all approaches 
for the implementation of 
application solutions, not just 
the fastest approach or the 
one that is easiest to imple¬ 
ment. Java EE applications 
can be particularly difficult to 
develop around performance, 
because several points of con¬ 
tention in a Java EE environ¬ 
ment can add performance 
burdens to an application. 

The top performance issues 
experienced in Java EE appli- 


Information 



The calibration was successful. 

Click Show Details to see calibration results. 

Warning: If your computer uses dynamic CPU frequency switching, 
please disable it and rerun calibration as changing the CPU frequency 
during profiling would produce inaccurate results. 


Show Details 
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cation environments tend to be 
related to configuration issues and 
environment issues. Oftentimes, 
the applications themselves are 
coded fine, but the application 
server to which they are deployed 
is not tuned correctly or it is config¬ 
ured inappropriately for the applica¬ 
tions or the intended user capacity. 

To tune properly for a production 
employment, perform the following 
steps in the order listed: 


■ Application tuning 

■ Server tuning 

■ Java runtime tuning 

■ Server operating system and 
platform tuning 

Next, focus on decreasing 
the chance of incurring perfor¬ 
mance issues after a production 
deployment. 

Coding for performance. There 
are bevies of coding situations 
that might lead to performance 


overhead in a Java EE application. 
Java EE applications are executed 
concurrently, which can lead to 
bottlenecks, because users might 
be competing for resources such 
as web services or databases. Each 
remote call can add latency, and 
processes such as serialization can 
be CPU-intensive, causing further 
performance degradation. With 
these kinds of issues in mind, you 
need to craft Java EE applications 


carefully, ensuring that proper 
resource handling is used. 

An application's performance 
tuning should begin with the 
source code. Even though the top 
causes of performance issues in 
Java EE applications point to the 
environment, proper coding can 
still play a key role in an application 
that performs well. 

The following poor-coding prac¬ 
tices, among others, can lead to 
performance issues: 

■ Over-serialization and 
deserialization 

■ Overuse of finalizers 

■ Too much synchronization 

■ Not discarding unused variables 

■ Too many dynamic variables 

■ Rampant use of System.out 
.println() 

■ Sessions that are not released 
when they are no longer needed 

■ Failing to close resources (for 
example, database and network 
connections) 

Performing code reviews is 
imperative for reducing code that 
inhibits an application's perfor¬ 
mance. While poorly crafted code 
can slip past multiple develop¬ 
ers, the more eyes that examine 
it, the better. In addition to code 
reviews, more than one individual 
should run performance and load 
tests against an application, and 
compare results of current tests to 
those run previously. 



Figure 2 
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Figure 3 

Tuning the environment. Many 
environmental factors can make 
a difference in the performance 
of an application. Learn to love 
the administrative console or 
command-line utility for your appli¬ 
cation server, because you spend 
lots of time with it. For GlassFish 
4.x, the default domain is config¬ 
ured appropriately fortesting pur¬ 
poses, but it is likely not appropri¬ 
ate for a production environment 
without further configuration. 

Ensure that deployment settings 
are configured properly for your 
applications. If your application 
server allows autodeployment or 
dynamic application reloading, be 
sure to disable those settings in a 
production environment, because 
they can have a significant impact 


Figure 5 


on performance. 

Also, take note of 
how often logs are 
written, because 
frequent logging 
can cause untimely 
performance issues. 

Consider the 
application server 
configuration set¬ 
tings for enterprise components, 
such as the Enterprise JavaBeans 
(EJB) container, Java Message 
Service (JMS), and Java Transaction 
Service. Always review the default 
configuration settings, and modify 
them accordingly to support a 
production environment. Configure 
server clustering to provide high 
availability through load balancing, 
where appropriate. 


u ® 

ai 




Class Name - Live Objects 

Live Bytes [%] * 

Live Bytes 

Live Object 

chart] 



22,621,040 B 

217 

bytel] lo bource 

■ 

11,482,232 B (13%) 

28 

java Jang. Object!] 

■ 

6,894,984 B 

135 

]ava.util.HashMap$Node 

■ 

5,344,832 B (G%) 

167 

java.utiLHash Map $ Node [] 

■ 

4,343,080 B 

46 

java.lang. String 

i 

4,333,824 B (4.9%) 

180 

int[J 


i 

4,165,176 B (4,7%) 

11 

java.util. Array List 

i 

3,207,216 B (3,6%) 

133 

java. lang. reflect. Method 

i 

2,394,568 B (2 7%) 

27 

java. util. HashMap 

* 

2,357.040 B (2.7%) 

49 

Class Name Filter (Contains) 



Memory Results TIHT7?llB 




Do not treat your database as a 
black box for your data. Database 
access can become a point of con¬ 
tention for applications, either 
when querying a large data set or 
when performing too many small 
queries. Whatever the case, have 
a contact numberforyour DBA, 
who might be able to create an 
index on a database table, or per¬ 
haps even incorporate application 


logic into the database where that 
makes sense. 

Planning for Capacity and 
Preparing an Application 

Many tools can help you prepare an 
application for production release. 
Profiling tools can play a key role 
in an application's development 
lifecycle. Such tools can be used 
to forecast how an application will 
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perform after it's released into pro¬ 
duction under a normal or heavy 
load of users. 

Two such tools are the NetBeans 
profiler (a module of the NetBeans 
IDE) and Apache]Meter. Both of 
these tools can provide useful 
information about the perfor¬ 
mance of an application before it 
deploys to production, and both 
are open source. 

The NetBeans profiler makes it 
easy to learn important runtime 
information about your application 
by providing the ability to monitor 
thread states, CPU performance, 
and memory usage. The Apache 
jMeter desktop application simu¬ 
lates a heavy load on an application 
before it goes into production. 

To try out these useful tools, 
experiment by profiling a sim¬ 
ple Java EE application named 
AcmeWorld. You can download 
the AcmeWorld application from 
GitHub, and then use it to fol¬ 
low along as you read the rest of 
this article. 

Using the NetBeans profiler. The 

NetBeans profiler is built into 
NetBeans IDE 8, so you can use it 
for profiling your applications out 
of the box. To perform an accurate 
analysis of your applications, the 
profiler must first be calibrated to 
obtain data about the targeted Java 
platform and environment. The 
calibration needs to be performed 
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only once, but if you 
make any significant 
changes to the environ¬ 
ment, recalibrating is a 
good idea. 

To calibrate, choose 
Advanced Commands 
-> Manage Calibration 
Data from the Profile 
menu. Next, choose 
the targeted Java plat¬ 
form, and click the 
Calibrate button. After 
the calibration is com¬ 
plete, a success mes¬ 
sage is displayed (see 
Figure 1). 

By using the profiler, you can 
perform a myriad of profiling 
tasks against a NetBeans project. 
This article focuses on the default 
profiling configuration, but other 
options are available. The three 
main options are Monitor, CPU, and 
Memory. Each of those contains 
its own suboptions for performing 
quick or customized profiling. Read 
the "Introduction to Profiting Java 
Applications in NetBeans IDE" for 
more information. 

To monitor the AcmeWorld appli¬ 
cation, right-click the project in the 
Projects menu, and then choose 
Profile, which opens the profiler's 
main window. 

Then, choose Monitor. A few 
options (forthread monitoring and 
lock contention monitoring) are 


disabled by default. 
Accept the defaults 
and then click Run. 
Doing so deploys and 
starts the applica¬ 
tion, and opens both 
a Profiler window 
and a VM Telemetry 
Overview window (see 
Figure 2) within the 
NetBeans IDE. 

The Profiler window 
provides the ability to 
take snapshots, view 
dump heaps, and more. 
Take the opportunity 
to hover over the charts within the 
VM Telemetry Overview window 
to see detailed information about 
the heap or garbage collector at the 
specified time selection. 

After you are comfortable 
with these options, run the pro¬ 
filer again by right-clicking the 
NetBeans project and choosing 
Profile. If the profiler is still active, 
you are prompted to end the cur¬ 
rent session and begin a new one; 
otherwise, you can stop the profiler 
manually by clicking Stop within 
the Profiler window. 

This time, choose CPU, and 
then select the Advanced (instru¬ 
mented) option. The CPU profiler 
allows you to profile the perfor¬ 
mance of all application classes 
or only selected classes. That 
said, choosing the default Quick 


KNOW YOUR APP 


Before production 
deployment, it 

is imperative that 
developers have an 
accurate portrayal of 
how an application 
performs for its 
users. 
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(sampled) option for the CPU pro¬ 
filer can sometimes incur too much 
overhead and produce output that 
is not meaningful without perform¬ 
ing extended analysis. 

Click the customize link that 
appears near the end of the 
Advanced (instrumented) option. 
This link opens up the Edit Profiling 
Roots dialog box, which enables 
you to choose which servlets, lis¬ 
teners, or application packages to 
profile. By default, the Select View 
option is set to Web Application 
View, allowing you to select serv¬ 
lets and listeners. Change the 


Select View option to Package 
View, and then expand the tree 
menu to select only the org.javaee7 
.concurrency, org.javaee7.jsf, and 
org.javaee7.session packages, as 
shown in Figure 3. 

Within the Edit Profiling Roots 
dialog box, you can also select 
separate classes, down to the con¬ 
structors and methods, to pinpoint 
your performance measurements. 

Click OK to close the dialog box, 
and then ensure that the Filter 
option is set to Profile only project 
classes. Because enterprise appli¬ 
cations typically invoke Java classes 
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that reside in the target server, it 
is a good idea to limit the profiling 
to those classes that are internal to 
the application. Lastly, click Run. 

When the CPU profiler is run¬ 
ning, the Profiler window and 
VM Telemetry Overview window 
become active. The CPU profiler 
enables you to take snapshots, 
which display the runtime of 
active classes and methods (see 
Figure 4). These snapshots can be 
compared against others to deter¬ 
mine which methods are over¬ 
worked or cause contention. 

The snapshot editor provides the 
option to save the snapshots; view 
by package, class, or method; and 
filter by method name. You can 
also compare two snapshots, and 
when you're attempting to pinpoint 
a performance issue, this profiler 
option can be handy. There are also 
options to view by the "call tree," 
by "hot spots," or by a combination 
of those. 

The NetBeans profiler also pro¬ 
vides the option to profile applica¬ 
tion memory. Again, this feature 
allows you to select between quick 
and advanced profiling. 

For our demo, begin by click¬ 
ing the red stop icon in the Profiler 
menu to stop the current profiler 
task, if you haven't already. Next, 
right-click the project and choose 
Profile, and then select Memory. 
Next, select the Quick (sampled) 
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option and then choose Run. The 
Profiler window opens again, pro¬ 
viding the ability to view telemetry 
information and take snapshots, 
this time of application memory. 

In the snapshot view, the live 
class objects are sorted in the Live 
Bytes [%] column in descending 
order, making it easy to see which 
classes use the most memory. 
Right-clicking a class object pro¬ 
vides the option to go to the source 
code (see Figure 5). 

Now that you understand the 
profiling options, open a browser 
and navigate to the AcmeWorld 
application. Click the 
first example in the 
application, which is 
Current Reservation 
List. This invokes the 
findAIIO method of the 
ParkReservationFacade 
class (see Listing 1). 

After the page loads, 
go back into NetBeans 
and click the Live 
Results option within 
the Profiler window. 

This opens a live results 
editor pane, which 
displays the active 
"hot spot" methods. 

You should see that 
the AbstractFacade 
.findAIIO method is 
on top, because the 
ParkReservationFacade 


class extends AbstractFacade. 

The NetBeans profiler can be a 
big help for locating performance 
problems within an enterprise 
application before production 
deployment. This article just 
scratches the surface of what you 
can do with it. 

Using Apache JMeter. One of the 

most problematic performance 
concerns for a java EE applica¬ 
tion environment is user capacity. 
How many users can access the 
application at the same time? This 
question is impossible to answer 
in the development environment, 
because so many vari¬ 
ables are at play. Even if 
the development envi¬ 
ronment is configured 
exactly the same as the 
production environ¬ 
ment, some incon¬ 
sistencies between 
development and pro¬ 
duction almost always 
occur. User capacity 
testing is often a game 
of chance. However, 
the Apache JMeter tool 
can help you hone your 
capacity testing. 

The JMeter tool is 
easy to use, and it con¬ 
tains useful utilities for 
measuring performance 
and load testing. For 
more information about 


APP DEVELOPMENT 


Java EE 
applications can 
be particularly 
difficult to 
develop around 
performance, 
because several 
points of contention 
in a Java EE 
environment can 
add performance 
burdens to an 
application. 
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LISTING 1 


A 


public List<ParkReservation> currentReservations() { 
return ejbFacade.findAIIO; 

} 


P Download all listings in this issue as text 


installing and using JMeter, seethe 
Apache JMeter User's Manual. 

To begin, open JMeter and create 
a new test plan. Name the test plan 
AcmeWorld Testplan. 

Next, right-click the test plan in 
the left menu, and select Add -> 
Threads (Users). When the Thread 
Group panel is displayed, enter a 
value of 500 into the Number of 
Threads field to represent 500 con¬ 
current users. 

Next, add an HTTP Request 
to the test by right-clicking the 
newly created thread group and 
selecting New -> Sampler -> HTTP 
Request (see Figure 6). In the 
HTTP Request panel, fill in the 
server name and the port num¬ 
ber, and provide a path to test. 

In this case, use /AcmeWorld/ 
faces/currentReservations.xhtml for 
the path. 

The JMeter test is now ready to 
run. Start the application within 
your IDE. Then click the Start button 
within JMeter to run the 500-user 
test on the currentReservations view. 

You can use JMeter to read the 
results of the test by taking a look at 
different listeners. Right-click HTTP 


Request and select Add -> Listener 

to see a list of listeners that can be 
used to view various performance 
measures (see Figure 7). 

The most useful graphs are the 
Response Time Graph (shown in 
Figure 8), the View Results in Table, 
and the Summary Report. 

Putting it all together. The tools 
and techniques discussed so far 
can be used together to provide a 
variety of simulated performance 
and loading tests and help you 

■ Perform code reviews to ensure 
that proper coding techniques 
are used, and to help reduce 
costly mistakes. 

■ Take an abundance of snapshots 
using the NetBeans profiler, and 
save JMeter reports for compari¬ 
son under different environmen¬ 
tal situations. 

■ Perform capacity load test¬ 
ing with more threads than the 
number of users you expect on a 
normal day. 

■ Use JMeter capacity testing and 
take NetBeans profiler snapshots 
in unison to measure the perfor¬ 
mance of your application pro¬ 
cesses under a heavy load. 
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GlassFish Server Open Source Edition 4.1 REST Interface 


• acti v atedsessi onstotal 

o unit: count 
o lastsampletime ; -1 
o name : ActivatcdSessionsTotal 
o count;0 

o description : Total number of sessions ever activated 
o starttime : 1420726822075 

• acti ve serv letsloadedcou nt 

o unit: count 
o current: 4 

q lastsampletime : 1420726823914 
o lefwwaterm&rk : 0 
o name: ActiveServletsLoaded 
o description : Number of Servlets loaded 
o high watermark : 4 
q starttime: 1420726822074 
■ acti vesessionscuiTent 
o unit: count 
q current: 0 
q lastsampletime : -1 
a low watermark : 0 
o name : Active Sessions 
o description : Number of active sessions 
o high watermark : 0 
o starttime: 1420726822075 

• erroreoimt 

o unit: count 
o lastsampletime:-! 
o name : ErmrCount 


Figure 11 


While performing a true produc¬ 
tion test in a development envi¬ 
ronment is nearly impossible, the 
combination of these techniques 
can help you refine your application 
as well as your production environ¬ 
ment so that less time is required 
for reactive tuning. 

Taming the Beast: Reactive 
Tuning 

No matter how well a system has 
been designed for performance, 
something can happen to slow it 
down. More often than not, perfor¬ 


mance tuning is done in a reactive 
manner, so having a plan for per¬ 
forming reactive tuning is impera¬ 
tive. When tuning a Java EE applica¬ 
tion, you must peel back the onion 
of the stack, analyzing and inves¬ 
tigating each layer as the possible 
culprit for application performance 
degradation. 

In the case of GlassFish, the 
server contains built-in monitor¬ 
ing APIs, so you can monitor the 
modules of your Java EE application 
server. Monitoring can be config¬ 
ured with the asadmin command- 
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line utility or through the adminis¬ 
trative console (shown in Figure 9). 

Each of the modules can be 
configured to OFF (the default), 
LOW, or HIGH. After the modules 
are configured, an application can 
be monitored at the console (see 
Figure 10) or with the asadmin util¬ 
ity. The asadmin utility provides the 
added benefit of letting you pipe 
output into a file or a database to 
be analyzed at a later date against 
other saved monitoring data. 

Applications and server mod¬ 
ules can also be monitored on 
the web at the default URL of 
http://<localhost>:4848/monitoring/ 
domain/server, as seen in Figure 11. 
Because the GlassFish monitoring 
API is REST-based, applications 
can be built to interrogate the 
data and display in useful for¬ 
mats. One such utility is LightFish, 
which Java EE expert Adam Bien 
wrote. If you use another Java EE 
application server, search for the 
monitoring tool that best suits 
the environment. 

Careful monitoring of the appli¬ 
cation server using monitoring 
resources, such as those offered 
by GlassFish, is key to pinpoint¬ 
ing performance issues in Java EE 
applications after deployment. 

You must also monitor server 
resources, such as memory and 
CPU usage, to ensure that the 
server is not overloaded. Database 


monitoring might also be useful in 
a reactive tuning toolset. 

Conclusion 

To deem an application successful, 
it should include all of the intended 
functionality and perform well. 
Users that experience poor perfor¬ 
mance often consider the applica¬ 
tion unworkable. 

Incorporating proactive perfor¬ 
mance tuning measures into your 
development lifecycle can allevi¬ 
ate many hours of reactive tuning 
on the tail end. Before production 
deployment, it is imperative that 
developers have an accurate por¬ 
trayal of how an application per¬ 
forms for its users. The NetBeans 
profiler and Apache JMeter are two 
profiling tools that can help you 
forecast how an application will 
perform under a normal or heavy 
load of users after it is released to 
production. </article> 
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JOHAN VOS 


Writing JavaFX Applications for 
Mobile Devices 

With recent JavaFX ports, you can now create Java client applications for mobile devices. 



PHOTOGRAPH BY 
TON HENDRIKS 


1 ava developers can use 
J JavaFX to create great cli¬ 
ent applications that have a 
compelling user interface. By 
leveraging the power of the 
Java platform, developers 
can create client applications 
that address today's require¬ 
ments for performance, as 
well as look and feel. This 
article explores the state of 
JavaFX on mobile devices, 
with a specific focus on the 
Android platform. 

In the past, Java client 
applications focused on desk¬ 
top and laptop environments. 
That made sense, because 
most of the transactions 
started from those environ¬ 
ments. Also, the mobile 
landscape was so fragmented 
that creating applications 
capable of running on most 
mobile devices was difficult 
and expensive. 

Today, the situation has 


changed. End users often use 
mobile devices (smartphones 
and tablets) instead of laptop 
and desktop systems, and an 
increasing number of busi¬ 
ness transactions originate 
from those mobile devices. 

On mobile devices, consum¬ 
ers clearly prefer native appli¬ 
cations over browser-based 
applications. 

The mobile landscape has 
changed significantly over the 
last few years. Today, more 
than 90 percent of mobile 
devices use either the Google 
Android operating system or 
the Apple iOS operating sys¬ 
tem. Both platforms provide a 
convenient way for developers 
to distribute their applica¬ 
tions through an "app store." 
The Apple App Store accepts 
applications created foriOS; 
the Google Play Store is where 
Android applications should 
be uploaded. 


Your 

JavaFX 

Application 


Java Platform 



JavaFX 

Implementation 


Native OS 

(Windows, Mac OS X, Linux) 


Figure 1 


The combination of a 
new mobile landscape 
and JavaFX provides a 
huge opportunity to Java 
developers. With JavaFX 
on mobile devices, Java 
developers can now use 
their favorite language to 
create powerful, modern¬ 
looking applications that 
work on desktops, lap¬ 
tops, mobile devices, and 
embedded systems. 

Porting JavaFX to 
Mobile Devices 

One of the key benefits of 
Java is the "write once, run 
anywhere" paradigm. JavaFX 
is no exception to this rule. If 
you write a JavaFX application, 
the runtime environments 
handle the platform-specific 
issues so you can focus on 
application-specific needs. 

On desktop systems with 
either Microsoft Windows, 


Linux, or Mac OS X, the situa¬ 
tion is as shown in Figure 1. 

Oracle provides JavaFX 
implementations for 
Windows, Mac OS X, and 
Linux. 

The JavaFX porting efforts 
for mobile devices and 
embedded ARM systems are 
coordinated byJavaFXPorts, 
an initiative from the Java 
community. The codebase for 
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JavaFXPorts is based on the code 
for OpenJFX, which is the official 
repository for the JavaFX code avail¬ 
able at https://wiki.openidk.java 
.net/display/OpenlFX/Main. One 
goal of JavaFXPorts is to send all 
the changes that are made to build 
JavaFX on mobile devices back to 
OpenJFX. This ensures convergence 
rather than divergence among all 
implementations of JavaFX. 

The Android platform comes 
with its own virtual machine, 
named the Dalvik Virtual Machine 
(or Dalvik for short). While Dalvik is 
by no means a fully functional Java 
SE 8 implementation, it has many 
similarities to a Java SE 7 runtime. 

As part of the JavaFX Android 
port, the platform-specific JavaFX 
implementation leverages Dalvik. 
Furthermore, the native parts of 
the JavaFX runtime are compiled 


Your 

JavaFX 

Application 


Java Platform 



JavaFX Android 
Implementation 


Dalvik 

_ J 


Android 


Figure 2 


for Android, as shown in Figure 2. 

It is important to note that 
although Java SE 8 is not avail¬ 
able on Android, the JavaFX 8 APIs 
are available. Also, the internal 
implementation of JavaFX 8 on 
Android and iOS does not use Java 
SE 8-specific APIs. 

Moreover, it is possible to use 
lambda expressions, because a 
great tool called RetroLambda 
can convert them (at the byte¬ 
code level). RetroLambda inspects 
class files, and it replaces all 
invokedynamic calls with invoke calls 
that are supported in Java SE 7. 

On iOS, the situation is a bit 
more difficult, because no Java 
runtime is available on the iOS plat¬ 
form. Worse, Apple has restricted 
the just-in-time (JIT) options that 
are used by typical Java Virtual 
Machines (JVMs) and that compile 
bytecode to native code at runtime. 

However, RoboVM provides a 
smart, ahead-of-time compiler 
that translates Java code to native 
iOS code, eliminating the need 
for a JIT processor running on the 
devices. RoboVM offers more or 
less the same base classes as the 
Java classes in the Dalvik Virtual 
Machine, which means that at 
this moment, it offers a significant 
subset of the Java SE 7 platform. As 
a consequence, JavaFX 8 applica¬ 
tions that use Java SE 7 APIs can be 
created on desktop environments 
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and subsequently converted into 
Android oriOS applications. 

In Figure land Figure 2, it is 

clearthat your JavaFX applica¬ 
tion can be executed on desktops 
and laptops, as well as on Android 
oriOS devices. One restriction: 

The Java runtime environments 
on Android and iOS do not sup¬ 
port Java 8-specific APIs at the 
moment. Therefore, avoid using 
them in code that you want to run 
on mobile platforms. 

Creating a JavaFX Application 
on Android 

Writing Java code is one thing; 
deploying that code on mobile 
devices is another thing. Android 
applications are created by per¬ 
forming operations that convert 
Java classes and resources into an 
Android package (a .apk file). This 
Android package can be uploaded 
to the Google Play Store, where it 
can be downloaded by all Android 
users. While doing development, 
you can also attach an Android 
device to your development system 
with a USB cable and transfer the 
Android package to your device. 

Next, see how you can cre¬ 
ate an Android package based on 
your JavaFX code. Note that this 
build process is under continuous 
improvement. See the JavaFXPorts 
website for the latest (and sim¬ 
plest) build instructions. 


Prerequisites. Before you can build 
Android packages, you need to 
have the Android SDK. Install this 
SDK in a directory—for example, in 
/opt/android-sdk. 

You also need to install the addi¬ 
tional packages that provide the 
latest SDK tools, platform tools, 
and build tools, as explained here. 

In addition, you need to install 
the latest javafx-for-dalvik run¬ 
time (also called the dalvik-sdk), 
which can be downloaded from 
this Bit buck et page. Unpack the zip 
file in a directory—for example, in 
/opt/dalvik-sdk. 

The Dalvik SDK contains a 
samples directory that includes a 
sample JavaFX HelloWorld appli¬ 
cation, along with the required 
build scripts to create an Android 
package. You might also want to 
download the samples that are 
part of the JavaFXPorts project, 
which show specific scenarios. 
These samples are available at this 
Bitbucket page. 

The build system. Now that you 
have all the required tools, you can 
create a JavaFX application. This 
article uses the Android-Gradle 
plugin, so I recommend using the 
Gradle build system. However, the 
Android-Gradle plugin cannot be 
combined with the Java-Gradle 
plugin in a single project. Because 
developers often want to run their 
applications on a desktop or a 
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laptop first, it's a good idea to use a 
multiproject Gradle setup. 

The HelloWorld sample in the 
samples directory of the Dalvik 
SDK shows how this is structured. 
We have a root project with three 
subprojects: core, desktop, and 
android. The core subproject con¬ 
tains all code and shared resources 
(for example, .fxml files). It uses the 
java plugin to build the code. 

The desktop subproject contains 
no code or resources, but only a 
simple build.gradle file. The con¬ 
tents are shown in Listing 1 This 
build file specifies that we want to 
create an application that depends 
on the code in the core subproject, 
and that we want the HelloAndroid 
class to be the class that starts 
the application. 

While this desktop subproject is 
extremely simple, it is often good 
practice to have this subproject and 
to use it during development. The 
development lifecycle on Android 
is slightly longer than on a desktop, 
and during development, testing 
the application on a desktop or a 
laptop is often easier. 

The Android Gradle project. The 
third subproject, which is named 
android, is a typical Android Gradle 
project. Gradle is the preferred build 
system for Android applications, 
and the Android-Gradle plugin 
already contains most of the tasks 
and build logic required for creating 


native Android applications. 

The android directory in the 
samples/HelloWorld demo has a 
typical setup for an Android app: 

■ A build.gradle file for building the 
packages. 

■ An AndroidManifest.xml file, 
which contains application- 
specific information. 

■ An assets directory with some 
property files that will be com¬ 
piled into the package. You don't 
have to change those, but they 
should be present. 

Note that a typical Android appli¬ 
cation has more files and directo¬ 
ries, which allows for more custom¬ 
ization of the packages. However, 
in the HelloWorld demo, everything 
is kept to a minimum. One of the 
benefits of the Gradle system is that 
it relies on the "convention over 
configuration" principle. The default 
behavior is sufficient in most cases, 
but there are tons of configuration 
options so you can configure the 
application the way you want it. 

The AndroidManifest.xml file 
in the src/main directory can 
be used to supply information 
about the application. At a mini¬ 
mum, the information shown in 
Listing 2 must be specified in this 
manifest file. 

The text in italics should be 
altered for each application. Note 
that the android:name part of the 
activity tag can be chosen freely, 
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LISTING 2 / LISTING 3 / LISTING 4 


apply plugin: 'application 1 

mainClassName = 'org.javafxports.helloworld.HelloAndroid' 
dependencies { 
compile project(':core') 

} 


Download all listings in this issue as text 


and it is not necessarily the same as 
the main.class attribute specified in 
the meta-data section of the activity 

tag. The latter has to be the name of 
the JavaFX application class. 

In the build.gradle file, we have 
to instruct the Android tools to 
take into account the javaFX 
runtime libraries (both the Java 
classes and the native libraries). 

The build.gradle file contains a 
dependency section, in which we 
have to declare that the application 
depends on the code compiled in 
the core subproject. We also have 
to add the JAR files provided by the 
Dalvik SDK (see Listing 3). 

In the android section of the 
build.gradle file, we have to specify 
the location of native libraries 
and properties (see Listing4). The 
dalvikSdkLib that is used at several 
locations in the build.gradle file 


should point to the rt/lib directory 
in the Dalvik SDK—for example: 

I defdalvikSdkLib = 
/opt/dalvik-sdk/rt/lib 

With the combination of these 
three projects, you can now run the 
JavaFX application on a desktop 
or laptop system or on a mobile 
device. Executing the command 
gradle tasks shows an overview 
of available tasks. For example, 
gradle desktop:run executes the 
application on your development 
environment. 

To create an Android debug appli¬ 
cation and send it to an Android 
device that is connected to your 
development environment, execute 

gradle android:installDebug. To cre¬ 
ate a release version for the Android 
application that you want to upload 
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to the Play Store, you have to enter 
gradle androichassembleRelease. 

JavaFX-on-Android Features 

The JavaFX runtime on Android is 
based on the code in the OpenJFX 
source repository, and it supports 
most of the features of the corre¬ 
sponding JavaFX release for desk¬ 
top platforms. For example, dalvik- 
sdk-8u40 releases are based on the 
same codebase as JavaFX 8u40 for 
desktop platforms. At this moment, 
the media framework is not sup¬ 
ported. More features of JavaFX 
have been added to the Android 
port. For example: 

■ JavaFX on Android now supports 
JavaFX 3D. 

■ New components that are intro¬ 
duced in JavaFX 8 have been 
added, including the convenient 

DatePicker control. 

■ You can use FXML to build your 
Scene. 

Although using only standard¬ 
ized Java code is best, some¬ 
times you can benefit from using 
Android-specific code. Apart from 
the standard Java classes, Android 
APIs provide a rich framework 
that allows developers to integrate 
device-specific functionality such 
as the location providers (including 
GPS), a nearfield communication 
(NFC) reader, and so on. Including 
Android APIs breaks portability, 
so use a modular approach in your 


application, where the Android- 
specific parts are implemented in 
the android subproject. 

To access the Android APIs, 
a hook into the Android sys¬ 
tem (which is offered in the 
JavaFXPorts libraries) is required. 
Your application is started auto¬ 
matically by an instance of 
javafxports.android.FXActivity, 
a regular Android Activity that 
extends the Android Context class. 
You can access this Activity, and 
consequently all functionality 
that can be reached through an 
Activity, by calling the following: 

I FXActivity activity = 
FXActivity.getInstance(); 

As an example, you can get the 
Android NFC adapter by using the 
code shown in Listing 5. 

We've seen how you can use 
Android code in a JavaFX appli¬ 
cation. In some cases, you want 
to do the opposite: You have an 
Android application, and you want 
to use JavaFX components inside 
your Android application. While 
this adds more requirements for 
the integration of user interfaces, 
it is technically possible. 

The samples repository, available 
here, contains the Kokos project, 
which is an example of doing this. 
This example was created with the 
Android Studio IDE, and it starts 
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LISTING 5 LISTING 6 



android.content.Context ctx = FXActivity.getInstance(); 
android.nfc.NfcAdapter nfcAdapter = 
NfcAdapter.getDefaultAdapter(ctx); 


Download all listings in this issue as text 


with a regular Android applica¬ 
tion. By using the Android Studio 
interface, you created a new 

<code>fragment<code> as a sub¬ 
class of javafxports.android 
.FXFragment. 

The FXFragment class is a sub¬ 
class of the Android Fragment class, 
and it allows you to run a JavaFX 
application inside its view. In the 
Kokos example, thejavafxports.org 
.kokos.MyFxApp class is a JavaFX 
application that is assigned to the 
created fragment, as shown in 
Listing 6. 

Note that the possibility of using 
JavaFX application code inside 
Android fragments is still under 
early development. Start with a 
JavaFX application whenever pos¬ 
sible, rather than starting with an 
Android application and adding a 
JavaFX application to it. 

Conclusion 

JavaFX became mature right on 
time for being part of the mobile 


revolution. Native applications on 
mobile devices are becoming an 
interesting area of development. 

The porting efforts for providing 
JavaFX on mobile devices are a Java 
community initiative, and they still 
require lots of work. To get involved 
in the porting efforts, send an 
e-mail to info@javafxports.org. 

The port of JavaFX to Android 
is a continuous effort. The build 
process has been simplified since 
this article was written, and you 
can now build iOS applications 
leveraging the same code and 
tools. Read more about this effort 
here. We wrote a Gradle plugin 
that runs your existing JavaFX 
code on desktop, Android, and 
iOS devices. Using the new javafx- 
mobile plugin is very simple; it's 
described here. </article> 
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1 THE PROBLEM 


Tue. v\tw a^\c 5 Tt«*\e API i^l^ocJ^ced \y\ Uav/a 2 \s a 

(4itA^\a4iC *i^v\p<rov/e^e.^\4 0\JtX 4U£ or\j\v\A[ \v\ 4he 3PK- 

3SR 310 no4 only 'm4^od(ACes a new API 4h*4 'is Simple^ And less 
error-prone, b<n4 i4 <*lso 'm4rod<ACes new I<4nc4iOnAl't4y 4 0 be44er 
l^Andle 4i |V '& cones And <Am4 4es4mg oT compliCA4ed d^4e logic. 

One oT 4he new CApAb'ilHies oT 4he Sava 2 PA4e And Time 
API is 4he conCep4 oT TemporAlAdyAs4ers, wb,c|^ CAn be insed 4o 
e*4ernAlize d^4e Adj<As4 lv 'en4s. Bvul4“’m TemporAlAdyAs4ers handle 
mAny common operA4iOnS SiaCI^ as -finding 4h£ Ia$ 4 dAy oT 4he 
mon4u or yeAr. B*a4 W|^a4 iI yoiA WAn4 4o do some4hmg specdic 4o 
yoi/vr Appl'|CA4'ton, SiaCUv as Tmd’tng 4h£ ne*4 occurrence oT Fr'idAy 
4h£ 134 h, ^n lAnl^cicy dAy 'm Wes4ern Si4pers4i4iOn? 


2 THE COPE 



Hint: Review a 
presentation on the 
Date and Time API. 




Here is a simple code snippe4 4h*4 deTmes a TemporAlAdyAs4er lAmbd<* 
expression 4° do eXAc4ly 4his- Given An Arb’HrAry mpt\4 d<*4t (lor eXAmple, 
LocAlPA4e or LocAlPA4eT'ime'), 4hts code should re4wrn 4he nex4 occurrence oT 
a Fr’idAy On 4he l34(n op 4he mon4h- 

TemporalAdjuster fridayl3Adjuster = temporal -> { 

temporal = temporal.with(ChronoField.DAY_OF_MONTH, 13); 
while (temporal.get(ChronoField.DAY_OF_WEEK) != 

DayOfWeek.FRIDAY.getValue()) { 

temporal = temporal.plus(1, ChronoUnit.MONTHS); 

} 

return temporal; 

} ; 

// Examples usage: 

LocalDate today = LocalDate.now(); 

LocalDate nextFridayl3 = today.adjustlnto(fridayl3Adjuster); 


3 WHAT'S THE FIX? 


Some4hmg 'is wrong wHh 4he Above TemporAlAdj<As4er- Wh^4 coinld ’i4 be? 

1) VAr'iA4'iOns in mon4h- Poes 4he code need 4o AccoiAn4 Tor d'lTTe^&nCes 'm 

mon4k leng4k W(^en 'mCremenT’ing 4ke Te^po^Al? 

2) A 4ime zone ’iss<ne- Poes 4ue code need 4® Accoinn4 To^ inp<A4 d<*4es w'^U 4"i ,v 'e 
cone componen4s? 

3) An error 'm 4l>£ T°rmiAlA. M^ybe 4Ue CAlCiAlA4’iOn oT Fr'tdAy 4^£ 134U iS 

T<And‘^ IA A£n4^lly Flawed? 

T) Some4Lmg else en4'irely? 



ART BY l-HUA CHEN 


THE ANSWER? 

Loor -fW 4ke Av\swe« A 4ue ^\ex4 ]ss^\t- 0^ svAbw\',4 yow ow^ co<de c^Ue^e! 
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